diff mbox

[RFC] ath10k: silence firmware file probing warnings

Message ID 1468933237-5226-1-git-send-email-michal.kazior@tieto.com (mailing list archive)
State Changes Requested
Delegated to: Kalle Valo
Headers show

Commit Message

Michal Kazior July 19, 2016, 1 p.m. UTC
Firmware files are versioned to prevent older
driver instances to load unsupported firmware
blobs. This is reflected with a fallback logic
which attempts to load several firmware files.

This however produced a lot of unnecessary
warnings sometimes confusing users and leading
them to rename firmware files making things even
more confusing.

Hence use request_firmware_direct() which does not
produce extra warnings. This shouldn't really
break anything because most modern systems don't
rely on udev/hotplug helpers to load firmware
files anymore.

Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
---
 drivers/net/wireless/ath/ath10k/core.c     | 11 +++++------
 drivers/net/wireless/ath/ath10k/testmode.c |  5 ++++-
 2 files changed, 9 insertions(+), 7 deletions(-)

Comments

Stanislaw Gruszka July 21, 2016, 7:09 a.m. UTC | #1
On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
> Firmware files are versioned to prevent older
> driver instances to load unsupported firmware
> blobs. This is reflected with a fallback logic
> which attempts to load several firmware files.
> 
> This however produced a lot of unnecessary
> warnings sometimes confusing users and leading
> them to rename firmware files making things even
> more confusing.

This happens on kernels configured with
CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
but also 60 seconds delay before loading next firmware version.
For some reason RHEL kernel needs above config option, so this
patch is very welcome from my perspective.

Stanislaw
Emmanuel Grumbach July 21, 2016, 7:36 a.m. UTC | #2
On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
>> Firmware files are versioned to prevent older
>> driver instances to load unsupported firmware
>> blobs. This is reflected with a fallback logic
>> which attempts to load several firmware files.
>>
>> This however produced a lot of unnecessary
>> warnings sometimes confusing users and leading
>> them to rename firmware files making things even
>> more confusing.
>
> This happens on kernels configured with
> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
> but also 60 seconds delay before loading next firmware version.
> For some reason RHEL kernel needs above config option, so this
> patch is very welcome from my perspective.
>

Sorry for my ignorance but how does the firmware loading work if not
with udev's help? As you can imagine, iwlwifi is suffering from the
same problem and I would be interested in applying the same change,
but I'd love to understand a bit more :)
Stanislaw Gruszka July 21, 2016, 8:05 a.m. UTC | #3
On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> > On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
> >> Firmware files are versioned to prevent older
> >> driver instances to load unsupported firmware
> >> blobs. This is reflected with a fallback logic
> >> which attempts to load several firmware files.
> >>
> >> This however produced a lot of unnecessary
> >> warnings sometimes confusing users and leading
> >> them to rename firmware files making things even
> >> more confusing.
> >
> > This happens on kernels configured with
> > CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
> > but also 60 seconds delay before loading next firmware version.
> > For some reason RHEL kernel needs above config option, so this
> > patch is very welcome from my perspective.
> >
> 
> Sorry for my ignorance but how does the firmware loading work if not
> with udev's help?

I'm not sure exactly, but I think kernel VFS layer is capable to copy
file data directly from mounted filesystem without user space helper.

> As you can imagine, iwlwifi is suffering from the
> same problem and I would be interested in applying the same change,
> but I'd love to understand a bit more :)

Yes, iwlwifi (and some other drivers) suffer from this. However this
happen when the newest firmware version is not installed on the system
and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
it's not common.

I started to see this currently, because that option was enabled on 
RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
happened because of that, i.e. thermal device was not functional
because f/w wasn't loaded due to big delay.

I'm not sure if replacing to request_firmware_direct() is a good
fix though. For example I can see this problem also on brcmfmac, which
use request_firmware_nowait(). I think I would rather prefer special
helper for firmware drivers that needs user helper and have
request_firmware() be direct as default.

Stanislaw
Prarit Bhargava July 21, 2016, 10:23 a.m. UTC | #4
On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
>>>> Firmware files are versioned to prevent older
>>>> driver instances to load unsupported firmware
>>>> blobs. This is reflected with a fallback logic
>>>> which attempts to load several firmware files.
>>>>
>>>> This however produced a lot of unnecessary
>>>> warnings sometimes confusing users and leading
>>>> them to rename firmware files making things even
>>>> more confusing.
>>>
>>> This happens on kernels configured with
>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
>>> but also 60 seconds delay before loading next firmware version.
>>> For some reason RHEL kernel needs above config option, so this
>>> patch is very welcome from my perspective.
>>>
>>
>> Sorry for my ignorance but how does the firmware loading work if not
>> with udev's help?
> 
> I'm not sure exactly, but I think kernel VFS layer is capable to copy
> file data directly from mounted filesystem without user space helper.

Here's the situation: request_firmware() waits 60 seconds for udev to do its
loading magic via a "usermode helper".  This delay is there to allow, for
example, userspace to unpack or download a new firmware image or verify the
firmware image *in userspace* before providing it to the driver to apply to the HW.

Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
handshake on completion.

> 
>> As you can imagine, iwlwifi is suffering from the
>> same problem and I would be interested in applying the same change,
>> but I'd love to understand a bit more :)
> 
> Yes, iwlwifi (and some other drivers) suffer from this. However this
> happen when the newest firmware version is not installed on the system
> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
> it's not common.

request_firmware_direct() was introduced at my request because (as you've
noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
periods of time when starting.  The bug that this introduced was a 60 second
delay per logical cpu when starting a system.  On a 64 cpu system that meant the
boot would complete in a little over one hour.

> 
> I started to see this currently, because that option was enabled on 
> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
> happened because of that, i.e. thermal device was not functional
> because f/w wasn't loaded due to big delay.
> 
> I'm not sure if replacing to request_firmware_direct() is a good
> fix though. For example I can see this problem also on brcmfmac, which
> use request_firmware_nowait(). I think I would rather prefer special
> helper for firmware drivers that needs user helper and have
> request_firmware() be direct as default.
> 

The difference between request_firmware_direct() and request_firmware() is that
the _direct() version does not wait the 60 seconds for udev interaction.  The
only userspace check performed is to see if the file is there, and if the file
does exist it is provided to the driver to be applied to the hardware.

So the real question to ask here is whether or not the ath10k, brcmfmac, and
iwlwifi require udev to do anything beyond checking for the existence and
loading the firmware image.  If they don't, then it is better to use
request_firmware_direct().

P.

> Stanislaw
>
Stanislaw Gruszka July 21, 2016, 11:51 a.m. UTC | #5
(cc: firmware and brcmfmac maintainers)

On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
> 
> 
> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
> > On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
> >> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> >>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
> >>>> Firmware files are versioned to prevent older
> >>>> driver instances to load unsupported firmware
> >>>> blobs. This is reflected with a fallback logic
> >>>> which attempts to load several firmware files.
> >>>>
> >>>> This however produced a lot of unnecessary
> >>>> warnings sometimes confusing users and leading
> >>>> them to rename firmware files making things even
> >>>> more confusing.
> >>>
> >>> This happens on kernels configured with
> >>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
> >>> but also 60 seconds delay before loading next firmware version.
> >>> For some reason RHEL kernel needs above config option, so this
> >>> patch is very welcome from my perspective.
> >>>
> >>
> >> Sorry for my ignorance but how does the firmware loading work if not
> >> with udev's help?
> > 
> > I'm not sure exactly, but I think kernel VFS layer is capable to copy
> > file data directly from mounted filesystem without user space helper.
> 
> Here's the situation: request_firmware() waits 60 seconds for udev to do its
> loading magic via a "usermode helper".  This delay is there to allow, for
> example, userspace to unpack or download a new firmware image or verify the
> firmware image *in userspace* before providing it to the driver to apply to the HW.
> 
> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
> handshake on completion.
> 
> > 
> >> As you can imagine, iwlwifi is suffering from the
> >> same problem and I would be interested in applying the same change,
> >> but I'd love to understand a bit more :)
> > 
> > Yes, iwlwifi (and some other drivers) suffer from this. However this
> > happen when the newest firmware version is not installed on the system
> > and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
> > it's not common.
> 
> request_firmware_direct() was introduced at my request because (as you've
> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
> periods of time when starting.  The bug that this introduced was a 60 second
> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
> boot would complete in a little over one hour.
> 
> > 
> > I started to see this currently, because that option was enabled on 
> > RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
> > happened because of that, i.e. thermal device was not functional
> > because f/w wasn't loaded due to big delay.
> > 
> > I'm not sure if replacing to request_firmware_direct() is a good
> > fix though. For example I can see this problem also on brcmfmac, which
> > use request_firmware_nowait(). I think I would rather prefer special
> > helper for firmware drivers that needs user helper and have
> > request_firmware() be direct as default.
> > 
> 
> The difference between request_firmware_direct() and request_firmware() is that
> the _direct() version does not wait the 60 seconds for udev interaction.  The
> only userspace check performed is to see if the file is there, and if the file
> does exist it is provided to the driver to be applied to the hardware.
> 
> So the real question to ask here is whether or not the ath10k, brcmfmac, and
> iwlwifi require udev to do anything beyond checking for the existence and
> loading the firmware image.  If they don't, then it is better to use
> request_firmware_direct().

They don't need that, like 99% of the drivers I think, hence changing the
default seems to be more reasonable. However changing 3 drivers would work
for me as well, and that change do not introduce risk of broking drivers
that require udev fw download.

iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
use request_firmware_nowait(), so it first need to be converted to
ordinary request_firmware(), but this should be doable and I can do
that.

However I wonder if changing that will not broke the case when
driver is build-in in the kernel and f/w is not yet available when
driver start to initialize. Or maybe nowadays this is not the case
any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
images are build-in in the kernel or copied to initramfs?

Thanks
Stanislaw
Prarit Bhargava July 21, 2016, 12:01 p.m. UTC | #6
On 07/21/2016 07:51 AM, Stanislaw Gruszka wrote:
> (cc: firmware and brcmfmac maintainers)

<snip>

>>>
>>> I'm not sure if replacing to request_firmware_direct() is a good
>>> fix though. For example I can see this problem also on brcmfmac, which
>>> use request_firmware_nowait(). I think I would rather prefer special
>>> helper for firmware drivers that needs user helper and have
>>> request_firmware() be direct as default.
>>>
>>
>> The difference between request_firmware_direct() and request_firmware() is that
>> the _direct() version does not wait the 60 seconds for udev interaction.  The
>> only userspace check performed is to see if the file is there, and if the file
>> does exist it is provided to the driver to be applied to the hardware.
>>
>> So the real question to ask here is whether or not the ath10k, brcmfmac, and
>> iwlwifi require udev to do anything beyond checking for the existence and
>> loading the firmware image.  If they don't, then it is better to use
>> request_firmware_direct().
> 
> They don't need that, like 99% of the drivers I think, hence changing the
> default seems to be more reasonable. However changing 3 drivers would work

I think I argued for that a while back and changing the default was rejected.
I can't remember why it was rejected :(.  It may have had something to do with
the complexity of getting a large number of driver maintainers to ack the change.

> for me as well, and that change do not introduce risk of broking drivers
> that require udev fw download.

In my experience, there are very few drivers that actually require userspace
interaction beyond verifying the image location.  (The one that comes to mind is
the dell_rbu driver which attempts to download FW images)

> 
> iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
> use request_firmware_nowait(), so it first need to be converted to
> ordinary request_firmware(), but this should be doable and I can do
> that.
> 
> However I wonder if changing that will not broke the case when
> driver is build-in in the kernel and f/w is not yet available when
> driver start to initialize. 

As you say below ...

Or maybe nowadays this is not the case
> any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
> images are build-in in the kernel or copied to initramfs?

This is correct AFAIU.  If MODULE_FIRMWARE=y then the firmware should be loaded
into the kernel image temp fs and/or initramfs.  This of course assumes that the
person building the image is smart enough to have installed the FW on their system.

P.
Arend van Spriel July 22, 2016, 8:38 a.m. UTC | #7
+ Luis

On 21-7-2016 13:51, Stanislaw Gruszka wrote:
> (cc: firmware and brcmfmac maintainers)
> 
> On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
>>
>>
>> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
>>> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
>>>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
>>>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
>>>>>> Firmware files are versioned to prevent older
>>>>>> driver instances to load unsupported firmware
>>>>>> blobs. This is reflected with a fallback logic
>>>>>> which attempts to load several firmware files.
>>>>>>
>>>>>> This however produced a lot of unnecessary
>>>>>> warnings sometimes confusing users and leading
>>>>>> them to rename firmware files making things even
>>>>>> more confusing.
>>>>>
>>>>> This happens on kernels configured with
>>>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
>>>>> but also 60 seconds delay before loading next firmware version.
>>>>> For some reason RHEL kernel needs above config option, so this
>>>>> patch is very welcome from my perspective.
>>>>>
>>>>
>>>> Sorry for my ignorance but how does the firmware loading work if not
>>>> with udev's help?
>>>
>>> I'm not sure exactly, but I think kernel VFS layer is capable to copy
>>> file data directly from mounted filesystem without user space helper.
>>
>> Here's the situation: request_firmware() waits 60 seconds for udev to do its
>> loading magic via a "usermode helper".  This delay is there to allow, for
>> example, userspace to unpack or download a new firmware image or verify the
>> firmware image *in userspace* before providing it to the driver to apply to the HW.
>>
>> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
>> handshake on completion.
>>
>>>
>>>> As you can imagine, iwlwifi is suffering from the
>>>> same problem and I would be interested in applying the same change,
>>>> but I'd love to understand a bit more :)
>>>
>>> Yes, iwlwifi (and some other drivers) suffer from this. However this
>>> happen when the newest firmware version is not installed on the system
>>> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
>>> it's not common.
>>
>> request_firmware_direct() was introduced at my request because (as you've
>> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
>> periods of time when starting.  The bug that this introduced was a 60 second
>> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
>> boot would complete in a little over one hour.
>>
>>>
>>> I started to see this currently, because that option was enabled on 
>>> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
>>> happened because of that, i.e. thermal device was not functional
>>> because f/w wasn't loaded due to big delay.
>>>
>>> I'm not sure if replacing to request_firmware_direct() is a good
>>> fix though. For example I can see this problem also on brcmfmac, which
>>> use request_firmware_nowait(). I think I would rather prefer special
>>> helper for firmware drivers that needs user helper and have
>>> request_firmware() be direct as default.
>>>
>>
>> The difference between request_firmware_direct() and request_firmware() is that
>> the _direct() version does not wait the 60 seconds for udev interaction.  The
>> only userspace check performed is to see if the file is there, and if the file
>> does exist it is provided to the driver to be applied to the hardware.
>>
>> So the real question to ask here is whether or not the ath10k, brcmfmac, and
>> iwlwifi require udev to do anything beyond checking for the existence and
>> loading the firmware image.  If they don't, then it is better to use
>> request_firmware_direct().
> 
> They don't need that, like 99% of the drivers I think, hence changing the
> default seems to be more reasonable. However changing 3 drivers would work
> for me as well, and that change do not introduce risk of broking drivers
> that require udev fw download.
> 
> iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
> use request_firmware_nowait(), so it first need to be converted to
> ordinary request_firmware(), but this should be doable and I can do
> that.

I am going bonkers here. This is the Nth time a discussion pops up on
firmware API usage. I stopped counting N :-( So the first issue was that
the INIT was taking to long as we were requesting firmware during probe
which was executed in the INIT context. So we added a worker and
register the driver from there. There was probably a reason for
switching to _no_wait() as well, but I do not recall the details. The
things is I don't know if I need user-space or not. I just need firmware
to get the device up and running. We have changed our driver a couple of
times now to accommodate something that in my opinion should have been
abstracted behind the firmware API in the first place and now here is
another proposal to change the drivers. Come on!

> However I wonder if changing that will not broke the case when
> driver is build-in in the kernel and f/w is not yet available when
> driver start to initialize. Or maybe nowadays this is not the case
> any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
> images are build-in in the kernel or copied to initramfs?

That is a nice idea, but I have not seen any change in that area. Could
have missed it.

Regards,
Arend

> Thanks
> Stanislaw
>
Stanislaw Gruszka July 22, 2016, 10:26 a.m. UTC | #8
On Fri, Jul 22, 2016 at 10:38:24AM +0200, Arend Van Spriel wrote:
> + Luis
> 
> On 21-7-2016 13:51, Stanislaw Gruszka wrote:
> > (cc: firmware and brcmfmac maintainers)
> > 
> > On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
> >>
> >>
> >> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
> >>> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
> >>>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> >>>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
> >>>>>> Firmware files are versioned to prevent older
> >>>>>> driver instances to load unsupported firmware
> >>>>>> blobs. This is reflected with a fallback logic
> >>>>>> which attempts to load several firmware files.
> >>>>>>
> >>>>>> This however produced a lot of unnecessary
> >>>>>> warnings sometimes confusing users and leading
> >>>>>> them to rename firmware files making things even
> >>>>>> more confusing.
> >>>>>
> >>>>> This happens on kernels configured with
> >>>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
> >>>>> but also 60 seconds delay before loading next firmware version.
> >>>>> For some reason RHEL kernel needs above config option, so this
> >>>>> patch is very welcome from my perspective.
> >>>>>
> >>>>
> >>>> Sorry for my ignorance but how does the firmware loading work if not
> >>>> with udev's help?
> >>>
> >>> I'm not sure exactly, but I think kernel VFS layer is capable to copy
> >>> file data directly from mounted filesystem without user space helper.
> >>
> >> Here's the situation: request_firmware() waits 60 seconds for udev to do its
> >> loading magic via a "usermode helper".  This delay is there to allow, for
> >> example, userspace to unpack or download a new firmware image or verify the
> >> firmware image *in userspace* before providing it to the driver to apply to the HW.
> >>
> >> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
> >> handshake on completion.
> >>
> >>>
> >>>> As you can imagine, iwlwifi is suffering from the
> >>>> same problem and I would be interested in applying the same change,
> >>>> but I'd love to understand a bit more :)
> >>>
> >>> Yes, iwlwifi (and some other drivers) suffer from this. However this
> >>> happen when the newest firmware version is not installed on the system
> >>> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
> >>> it's not common.
> >>
> >> request_firmware_direct() was introduced at my request because (as you've
> >> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
> >> periods of time when starting.  The bug that this introduced was a 60 second
> >> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
> >> boot would complete in a little over one hour.
> >>
> >>>
> >>> I started to see this currently, because that option was enabled on 
> >>> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
> >>> happened because of that, i.e. thermal device was not functional
> >>> because f/w wasn't loaded due to big delay.
> >>>
> >>> I'm not sure if replacing to request_firmware_direct() is a good
> >>> fix though. For example I can see this problem also on brcmfmac, which
> >>> use request_firmware_nowait(). I think I would rather prefer special
> >>> helper for firmware drivers that needs user helper and have
> >>> request_firmware() be direct as default.
> >>>
> >>
> >> The difference between request_firmware_direct() and request_firmware() is that
> >> the _direct() version does not wait the 60 seconds for udev interaction.  The
> >> only userspace check performed is to see if the file is there, and if the file
> >> does exist it is provided to the driver to be applied to the hardware.
> >>
> >> So the real question to ask here is whether or not the ath10k, brcmfmac, and
> >> iwlwifi require udev to do anything beyond checking for the existence and
> >> loading the firmware image.  If they don't, then it is better to use
> >> request_firmware_direct().
> > 
> > They don't need that, like 99% of the drivers I think, hence changing the
> > default seems to be more reasonable. However changing 3 drivers would work
> > for me as well, and that change do not introduce risk of broking drivers
> > that require udev fw download.
> > 
> > iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
> > use request_firmware_nowait(), so it first need to be converted to
> > ordinary request_firmware(), but this should be doable and I can do
> > that.
> 
> I am going bonkers here. This is the Nth time a discussion pops up on
> firmware API usage. I stopped counting N :-( So the first issue was that
> the INIT was taking to long as we were requesting firmware during probe
> which was executed in the INIT context. So we added a worker and
> register the driver from there. There was probably a reason for
> switching to _no_wait() as well, but I do not recall the details. The
> things is I don't know if I need user-space or not. I just need firmware
> to get the device up and running. We have changed our driver a couple of
> times now to accommodate something that in my opinion should have been
> abstracted behind the firmware API in the first place and now here is
> another proposal to change the drivers. Come on!

I understand you dislike that :-) Just to clarify the issue here:

Some drivers (including brcmfmac) request new firmware images, which are
not yet available (i.e. development F/W versions) and then fall-back
to older firmware version and works perfectly fine.

However with CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y configured, in case
of missing F/W image, request firmware involve user space helper and
waits 60s (loading_timeout value from drivers/base/firmware_class.c),
what delays creating network interface and confuse users.

For brcmfmac this looks like this:

[   15.160923] brcmfmac 0000:03:00.0: Direct firmware load for brcm/brcmfmac4356-pcie.txt failed with error -2
[   15.170759] brcmfmac 0000:03:00.0: Falling back to user helper
<snip>
[   75.709397] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Oct 22 2015 06:16:41 version 7.35.180.119 (r594535) FWID 01-1a5c4016
[   75.736941] brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code (0x30 0x30)

Without CONFIG_FW_LOADER_USER_HELPER_FALLBACK first firmware request
silently fail and then instantly next F/W image is loaded.

Another option to solve to problem would be stop requesting not
available publicly firmware. However, I assume some drivers would
like to preserve that option.

> > However I wonder if changing that will not broke the case when
> > driver is build-in in the kernel and f/w is not yet available when
> > driver start to initialize. Or maybe nowadays this is not the case
> > any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
> > images are build-in in the kernel or copied to initramfs?
> 
> That is a nice idea, but I have not seen any change in that area. Could
> have missed it.

I believe this is how the things are already done, IOW switching to
request_firmware_direct() in the driver should be no harm.

Stanislaw
Arend van Spriel July 22, 2016, 12:21 p.m. UTC | #9
On 22-7-2016 12:26, Stanislaw Gruszka wrote:
> On Fri, Jul 22, 2016 at 10:38:24AM +0200, Arend Van Spriel wrote:
>> + Luis
>>
>> On 21-7-2016 13:51, Stanislaw Gruszka wrote:
>>> (cc: firmware and brcmfmac maintainers)
>>>
>>> On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
>>>>
>>>>
>>>> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
>>>>> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
>>>>>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
>>>>>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
>>>>>>>> Firmware files are versioned to prevent older
>>>>>>>> driver instances to load unsupported firmware
>>>>>>>> blobs. This is reflected with a fallback logic
>>>>>>>> which attempts to load several firmware files.
>>>>>>>>
>>>>>>>> This however produced a lot of unnecessary
>>>>>>>> warnings sometimes confusing users and leading
>>>>>>>> them to rename firmware files making things even
>>>>>>>> more confusing.
>>>>>>>
>>>>>>> This happens on kernels configured with
>>>>>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
>>>>>>> but also 60 seconds delay before loading next firmware version.
>>>>>>> For some reason RHEL kernel needs above config option, so this
>>>>>>> patch is very welcome from my perspective.
>>>>>>>
>>>>>>
>>>>>> Sorry for my ignorance but how does the firmware loading work if not
>>>>>> with udev's help?
>>>>>
>>>>> I'm not sure exactly, but I think kernel VFS layer is capable to copy
>>>>> file data directly from mounted filesystem without user space helper.
>>>>
>>>> Here's the situation: request_firmware() waits 60 seconds for udev to do its
>>>> loading magic via a "usermode helper".  This delay is there to allow, for
>>>> example, userspace to unpack or download a new firmware image or verify the
>>>> firmware image *in userspace* before providing it to the driver to apply to the HW.
>>>>
>>>> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
>>>> handshake on completion.
>>>>
>>>>>
>>>>>> As you can imagine, iwlwifi is suffering from the
>>>>>> same problem and I would be interested in applying the same change,
>>>>>> but I'd love to understand a bit more :)
>>>>>
>>>>> Yes, iwlwifi (and some other drivers) suffer from this. However this
>>>>> happen when the newest firmware version is not installed on the system
>>>>> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
>>>>> it's not common.
>>>>
>>>> request_firmware_direct() was introduced at my request because (as you've
>>>> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
>>>> periods of time when starting.  The bug that this introduced was a 60 second
>>>> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
>>>> boot would complete in a little over one hour.
>>>>
>>>>>
>>>>> I started to see this currently, because that option was enabled on 
>>>>> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
>>>>> happened because of that, i.e. thermal device was not functional
>>>>> because f/w wasn't loaded due to big delay.
>>>>>
>>>>> I'm not sure if replacing to request_firmware_direct() is a good
>>>>> fix though. For example I can see this problem also on brcmfmac, which
>>>>> use request_firmware_nowait(). I think I would rather prefer special
>>>>> helper for firmware drivers that needs user helper and have
>>>>> request_firmware() be direct as default.
>>>>>
>>>>
>>>> The difference between request_firmware_direct() and request_firmware() is that
>>>> the _direct() version does not wait the 60 seconds for udev interaction.  The
>>>> only userspace check performed is to see if the file is there, and if the file
>>>> does exist it is provided to the driver to be applied to the hardware.
>>>>
>>>> So the real question to ask here is whether or not the ath10k, brcmfmac, and
>>>> iwlwifi require udev to do anything beyond checking for the existence and
>>>> loading the firmware image.  If they don't, then it is better to use
>>>> request_firmware_direct().
>>>
>>> They don't need that, like 99% of the drivers I think, hence changing the
>>> default seems to be more reasonable. However changing 3 drivers would work
>>> for me as well, and that change do not introduce risk of broking drivers
>>> that require udev fw download.
>>>
>>> iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
>>> use request_firmware_nowait(), so it first need to be converted to
>>> ordinary request_firmware(), but this should be doable and I can do
>>> that.
>>
>> I am going bonkers here. This is the Nth time a discussion pops up on
>> firmware API usage. I stopped counting N :-( So the first issue was that
>> the INIT was taking to long as we were requesting firmware during probe
>> which was executed in the INIT context. So we added a worker and
>> register the driver from there. There was probably a reason for
>> switching to _no_wait() as well, but I do not recall the details. The
>> things is I don't know if I need user-space or not. I just need firmware
>> to get the device up and running. We have changed our driver a couple of
>> times now to accommodate something that in my opinion should have been
>> abstracted behind the firmware API in the first place and now here is
>> another proposal to change the drivers. Come on!
> 
> I understand you dislike that :-) Just to clarify the issue here:
> 
> Some drivers (including brcmfmac) request new firmware images, which are
> not yet available (i.e. development F/W versions) and then fall-back
> to older firmware version and works perfectly fine.
> 
> However with CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y configured, in case
> of missing F/W image, request firmware involve user space helper and
> waits 60s (loading_timeout value from drivers/base/firmware_class.c),
> what delays creating network interface and confuse users.
> 
> For brcmfmac this looks like this:
> 
> [   15.160923] brcmfmac 0000:03:00.0: Direct firmware load for brcm/brcmfmac4356-pcie.txt failed with error -2
> [   15.170759] brcmfmac 0000:03:00.0: Falling back to user helper
> <snip>
> [   75.709397] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Oct 22 2015 06:16:41 version 7.35.180.119 (r594535) FWID 01-1a5c4016
> [   75.736941] brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code (0x30 0x30)
> 
> Without CONFIG_FW_LOADER_USER_HELPER_FALLBACK first firmware request
> silently fail and then instantly next F/W image is loaded.
> 
> Another option to solve to problem would be stop requesting not
> available publicly firmware. However, I assume some drivers would
> like to preserve that option.

Actually, this is not the case with brcmfmac. We do need a firmware
file, ie. brcm/brcmfmac4356-pcie.bin, and also request for a nvram file,
ie. brcm/brcmfmac4356-pcie.txt. The latter is optional and the device
works fine without it.

What is still unclear to me is when request_firmware_direct() would fail
and in what circumstances the udev helper is a valid callback. Can you
explain such a scenario. Another question I have is what the reasons are
behind the 60 seconds timeout.

>>> However I wonder if changing that will not broke the case when
>>> driver is build-in in the kernel and f/w is not yet available when
>>> driver start to initialize. Or maybe nowadays this is not the case
>>> any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
>>> images are build-in in the kernel or copied to initramfs?
>>
>> That is a nice idea, but I have not seen any change in that area. Could
>> have missed it.
> 
> I believe this is how the things are already done, IOW switching to
> request_firmware_direct() in the driver should be no harm.

Ok. What are the consequences when:
- driver is built-in.
- driver+firmware present on initramfs.
- driver on initramfs, firmware only present on rootfs.
- driver+firmware only on rootfs.

I assume the third one would be considered a configuration issue.

Regards,
Arend
Prarit Bhargava July 22, 2016, 12:51 p.m. UTC | #10
On 07/22/2016 08:21 AM, Arend Van Spriel wrote:
> On 22-7-2016 12:26, Stanislaw Gruszka wrote:
>> On Fri, Jul 22, 2016 at 10:38:24AM +0200, Arend Van Spriel wrote:
>>> + Luis
>>>
>>> On 21-7-2016 13:51, Stanislaw Gruszka wrote:
>>>> (cc: firmware and brcmfmac maintainers)
>>>>
>>>> On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
>>>>>
>>>>>
>>>>> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
>>>>>> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
>>>>>>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
>>>>>>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
>>>>>>>>> Firmware files are versioned to prevent older
>>>>>>>>> driver instances to load unsupported firmware
>>>>>>>>> blobs. This is reflected with a fallback logic
>>>>>>>>> which attempts to load several firmware files.
>>>>>>>>>
>>>>>>>>> This however produced a lot of unnecessary
>>>>>>>>> warnings sometimes confusing users and leading
>>>>>>>>> them to rename firmware files making things even
>>>>>>>>> more confusing.
>>>>>>>>
>>>>>>>> This happens on kernels configured with
>>>>>>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
>>>>>>>> but also 60 seconds delay before loading next firmware version.
>>>>>>>> For some reason RHEL kernel needs above config option, so this
>>>>>>>> patch is very welcome from my perspective.
>>>>>>>>
>>>>>>>
>>>>>>> Sorry for my ignorance but how does the firmware loading work if not
>>>>>>> with udev's help?
>>>>>>
>>>>>> I'm not sure exactly, but I think kernel VFS layer is capable to copy
>>>>>> file data directly from mounted filesystem without user space helper.
>>>>>
>>>>> Here's the situation: request_firmware() waits 60 seconds for udev to do its
>>>>> loading magic via a "usermode helper".  This delay is there to allow, for
>>>>> example, userspace to unpack or download a new firmware image or verify the
>>>>> firmware image *in userspace* before providing it to the driver to apply to the HW.
>>>>>
>>>>> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
>>>>> handshake on completion.
>>>>>
>>>>>>
>>>>>>> As you can imagine, iwlwifi is suffering from the
>>>>>>> same problem and I would be interested in applying the same change,
>>>>>>> but I'd love to understand a bit more :)
>>>>>>
>>>>>> Yes, iwlwifi (and some other drivers) suffer from this. However this
>>>>>> happen when the newest firmware version is not installed on the system
>>>>>> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
>>>>>> it's not common.
>>>>>
>>>>> request_firmware_direct() was introduced at my request because (as you've
>>>>> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
>>>>> periods of time when starting.  The bug that this introduced was a 60 second
>>>>> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
>>>>> boot would complete in a little over one hour.
>>>>>
>>>>>>
>>>>>> I started to see this currently, because that option was enabled on 
>>>>>> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
>>>>>> happened because of that, i.e. thermal device was not functional
>>>>>> because f/w wasn't loaded due to big delay.
>>>>>>
>>>>>> I'm not sure if replacing to request_firmware_direct() is a good
>>>>>> fix though. For example I can see this problem also on brcmfmac, which
>>>>>> use request_firmware_nowait(). I think I would rather prefer special
>>>>>> helper for firmware drivers that needs user helper and have
>>>>>> request_firmware() be direct as default.
>>>>>>
>>>>>
>>>>> The difference between request_firmware_direct() and request_firmware() is that
>>>>> the _direct() version does not wait the 60 seconds for udev interaction.  The
>>>>> only userspace check performed is to see if the file is there, and if the file
>>>>> does exist it is provided to the driver to be applied to the hardware.
>>>>>
>>>>> So the real question to ask here is whether or not the ath10k, brcmfmac, and
>>>>> iwlwifi require udev to do anything beyond checking for the existence and
>>>>> loading the firmware image.  If they don't, then it is better to use
>>>>> request_firmware_direct().
>>>>
>>>> They don't need that, like 99% of the drivers I think, hence changing the
>>>> default seems to be more reasonable. However changing 3 drivers would work
>>>> for me as well, and that change do not introduce risk of broking drivers
>>>> that require udev fw download.
>>>>
>>>> iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
>>>> use request_firmware_nowait(), so it first need to be converted to
>>>> ordinary request_firmware(), but this should be doable and I can do
>>>> that.
>>>
>>> I am going bonkers here. This is the Nth time a discussion pops up on
>>> firmware API usage. I stopped counting N :-( So the first issue was that
>>> the INIT was taking to long as we were requesting firmware during probe
>>> which was executed in the INIT context. So we added a worker and
>>> register the driver from there. There was probably a reason for
>>> switching to _no_wait() as well, but I do not recall the details. The
>>> things is I don't know if I need user-space or not. I just need firmware
>>> to get the device up and running. We have changed our driver a couple of
>>> times now to accommodate something that in my opinion should have been
>>> abstracted behind the firmware API in the first place and now here is
>>> another proposal to change the drivers. Come on!
>>
>> I understand you dislike that :-) Just to clarify the issue here:
>>
>> Some drivers (including brcmfmac) request new firmware images, which are
>> not yet available (i.e. development F/W versions) and then fall-back
>> to older firmware version and works perfectly fine.
>>
>> However with CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y configured, in case
>> of missing F/W image, request firmware involve user space helper and
>> waits 60s (loading_timeout value from drivers/base/firmware_class.c),
>> what delays creating network interface and confuse users.
>>
>> For brcmfmac this looks like this:
>>
>> [   15.160923] brcmfmac 0000:03:00.0: Direct firmware load for brcm/brcmfmac4356-pcie.txt failed with error -2
>> [   15.170759] brcmfmac 0000:03:00.0: Falling back to user helper
>> <snip>
>> [   75.709397] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Oct 22 2015 06:16:41 version 7.35.180.119 (r594535) FWID 01-1a5c4016
>> [   75.736941] brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code (0x30 0x30)
>>
>> Without CONFIG_FW_LOADER_USER_HELPER_FALLBACK first firmware request
>> silently fail and then instantly next F/W image is loaded.
>>
>> Another option to solve to problem would be stop requesting not
>> available publicly firmware. However, I assume some drivers would
>> like to preserve that option.
> 
> Actually, this is not the case with brcmfmac. We do need a firmware
> file, ie. brcm/brcmfmac4356-pcie.bin, and also request for a nvram file,
> ie. brcm/brcmfmac4356-pcie.txt. The latter is optional and the device
> works fine without it.
> 
> What is still unclear to me is when request_firmware_direct() would fail
> and in what circumstances the udev helper is a valid callback. Can you
> explain such a scenario. Another question I have is what the reasons are
> behind the 60 seconds timeout.

request_firmware_direct() will fail when the specified FW file is not present.
This is different from request_firmware() which implements a usermode helper to
potentially download firmware, or unpack a firmware image.

Re: 60 second timeout ... The 60 second timeout with request_firmware() is
completely arbitrary.  There is no way for udev to signal back to the kernel
that userspace helper has not completed its actions, so the kernel has a 60 dead
man timer-ish delay.

> 
>>>> However I wonder if changing that will not broke the case when
>>>> driver is build-in in the kernel and f/w is not yet available when
>>>> driver start to initialize. Or maybe nowadays this is not the case
>>>> any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
>>>> images are build-in in the kernel or copied to initramfs?
>>>
>>> That is a nice idea, but I have not seen any change in that area. Could
>>> have missed it.
>>
>> I believe this is how the things are already done, IOW switching to
>> request_firmware_direct() in the driver should be no harm.
> 
> Ok. What are the consequences when:
> - driver is built-in.
> - driver+firmware present on initramfs.
> - driver on initramfs, firmware only present on rootfs.
> - driver+firmware only on rootfs.
> 
> I assume the third one would be considered a configuration issue.

I think your question here can be answered by reading drivers/base/Kconfig:88,
and reading about those 4 config options.  I could paraphrase it butI think the
Kconfig notes are better than I could explain it.  Note that this is how things
currently work with request_firmware_nowait().  IIRC request_firmware_nowait()
is just an asynchronous version of request_firmware().

HTH,

P.
Luis Chamberlain July 22, 2016, 10:05 p.m. UTC | #11
On Fri, Jul 22, 2016 at 10:38:24AM +0200, Arend Van Spriel wrote:
> + Luis
> 
> On 21-7-2016 13:51, Stanislaw Gruszka wrote:
> > (cc: firmware and brcmfmac maintainers)
> > 
> > On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
> >>
> >>
> >> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
> >>> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
> >>>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> >>>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
> >>>>>> Firmware files are versioned to prevent older
> >>>>>> driver instances to load unsupported firmware
> >>>>>> blobs. This is reflected with a fallback logic
> >>>>>> which attempts to load several firmware files.
> >>>>>>
> >>>>>> This however produced a lot of unnecessary
> >>>>>> warnings sometimes confusing users and leading
> >>>>>> them to rename firmware files making things even
> >>>>>> more confusing.
> >>>>>
> >>>>> This happens on kernels configured with
> >>>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
> >>>>> but also 60 seconds delay before loading next firmware version.
> >>>>> For some reason RHEL kernel needs above config option, so this
> >>>>> patch is very welcome from my perspective.
> >>>>>
> >>>>
> >>>> Sorry for my ignorance but how does the firmware loading work if not
> >>>> with udev's help?
> >>>
> >>> I'm not sure exactly, but I think kernel VFS layer is capable to copy
> >>> file data directly from mounted filesystem without user space helper.
> >>
> >> Here's the situation: request_firmware() waits 60 seconds for udev to do its
> >> loading magic via a "usermode helper".  This delay is there to allow, for
> >> example, userspace to unpack or download a new firmware image or verify the
> >> firmware image *in userspace* before providing it to the driver to apply to the HW.
> >>
> >> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
> >> handshake on completion.
> >>
> >>>
> >>>> As you can imagine, iwlwifi is suffering from the
> >>>> same problem and I would be interested in applying the same change,
> >>>> but I'd love to understand a bit more :)
> >>>
> >>> Yes, iwlwifi (and some other drivers) suffer from this. However this
> >>> happen when the newest firmware version is not installed on the system
> >>> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
> >>> it's not common.
> >>
> >> request_firmware_direct() was introduced at my request because (as you've
> >> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
> >> periods of time when starting.  The bug that this introduced was a 60 second
> >> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
> >> boot would complete in a little over one hour.
> >>
> >>>
> >>> I started to see this currently, because that option was enabled on 
> >>> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
> >>> happened because of that, i.e. thermal device was not functional
> >>> because f/w wasn't loaded due to big delay.
> >>>
> >>> I'm not sure if replacing to request_firmware_direct() is a good
> >>> fix though. For example I can see this problem also on brcmfmac, which
> >>> use request_firmware_nowait(). I think I would rather prefer special
> >>> helper for firmware drivers that needs user helper and have
> >>> request_firmware() be direct as default.
> >>>
> >>
> >> The difference between request_firmware_direct() and request_firmware() is that
> >> the _direct() version does not wait the 60 seconds for udev interaction.  The
> >> only userspace check performed is to see if the file is there, and if the file
> >> does exist it is provided to the driver to be applied to the hardware.
> >>
> >> So the real question to ask here is whether or not the ath10k, brcmfmac, and
> >> iwlwifi require udev to do anything beyond checking for the existence and
> >> loading the firmware image.  If they don't, then it is better to use
> >> request_firmware_direct().
> > 
> > They don't need that, like 99% of the drivers I think, hence changing the
> > default seems to be more reasonable. However changing 3 drivers would work
> > for me as well, and that change do not introduce risk of broking drivers
> > that require udev fw download.
> > 
> > iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
> > use request_firmware_nowait(), so it first need to be converted to
> > ordinary request_firmware(), but this should be doable and I can do
> > that.
> 
> I am going bonkers here. This is the Nth time a discussion pops up on
> firmware API usage. I stopped counting N :-( So the first issue was that
> the INIT was taking to long as we were requesting firmware during probe
> which was executed in the INIT context. So we added a worker and
> register the driver from there. There was probably a reason for
> switching to _no_wait() as well, but I do not recall the details. The
> things is I don't know if I need user-space or not. I just need firmware
> to get the device up and running. We have changed our driver a couple of
> times now to accommodate something that in my opinion should have been
> abstracted behind the firmware API in the first place and now here is
> another proposal to change the drivers. Come on!

Its a big mess, but a lot of it has to do with the fact that none of the
issues have been well documented. Its also not clear what distros, driver
developers or users should do. I've tried helping with by providing such
documentation and also providing grammar rules to avoid further issues [0],
hopefully this series will be merged soon.

[0] https://marc.info/?l=linux-kernel&m=146611775314567

Using grammar rules the hunt with coccinelle as of today's linux-next
reveals there are only 2 explicit users of the usermode helper, and
I've vetted for these as well in the above patch series.

Now, other than this users will experience the usermode helper if the
distribution messed up and build their kernel with the fallback
usermode helper. If this was done on some old kernel the only way
to fix that is to fix that kernel build or change drivers to avoid
the usermode helper explicitly, unfortunately some API calls cannot
avoid it .... I've documented all this in the above series.

> > However I wonder if changing that will not broke the case when
> > driver is build-in in the kernel and f/w is not yet available when
> > driver start to initialize.

Indeed, tons of races are in theory possible here ;) technically since we use a
common API to read files directly now, a race might also be possible for other
users of the API on init as well. I have some grammar rules to test for this in
development, that is to vet that not only the firmware API is checked and we
avoid on init but also other callers that use the same read API.

> > Or maybe nowadays this is not the case
> > any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
> > images are build-in in the kernel or copied to initramfs?

The firmware API is a mess and I've been trying to correct that
with a more flexible API.

> That is a nice idea, but I have not seen any change in that area. Could
> have missed it.

Extensions to the fw API are IMHO best done through a newer flexible
API, feel free to refer to this development tree if you'd like to
contribute:

https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux-next.git/log/?h=20160616-sysdata-v2

  Luis
Luis Chamberlain July 22, 2016, 10:15 p.m. UTC | #12
On Fri, Jul 22, 2016 at 12:26:00PM +0200, Stanislaw Gruszka wrote:
> On Fri, Jul 22, 2016 at 10:38:24AM +0200, Arend Van Spriel wrote:
> > + Luis
> > 
> > On 21-7-2016 13:51, Stanislaw Gruszka wrote:
> > > (cc: firmware and brcmfmac maintainers)
> > > 
> > > On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
> > >>
> > >>
> > >> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
> > >>> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
> > >>>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> > >>>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
> > >>>>>> Firmware files are versioned to prevent older
> > >>>>>> driver instances to load unsupported firmware
> > >>>>>> blobs. This is reflected with a fallback logic
> > >>>>>> which attempts to load several firmware files.
> > >>>>>>
> > >>>>>> This however produced a lot of unnecessary
> > >>>>>> warnings sometimes confusing users and leading
> > >>>>>> them to rename firmware files making things even
> > >>>>>> more confusing.
> > >>>>>
> > >>>>> This happens on kernels configured with
> > >>>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
> > >>>>> but also 60 seconds delay before loading next firmware version.
> > >>>>> For some reason RHEL kernel needs above config option, so this
> > >>>>> patch is very welcome from my perspective.
> > >>>>>
> > >>>>
> > >>>> Sorry for my ignorance but how does the firmware loading work if not
> > >>>> with udev's help?
> > >>>
> > >>> I'm not sure exactly, but I think kernel VFS layer is capable to copy
> > >>> file data directly from mounted filesystem without user space helper.
> > >>
> > >> Here's the situation: request_firmware() waits 60 seconds for udev to do its
> > >> loading magic via a "usermode helper".  This delay is there to allow, for
> > >> example, userspace to unpack or download a new firmware image or verify the
> > >> firmware image *in userspace* before providing it to the driver to apply to the HW.
> > >>
> > >> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
> > >> handshake on completion.
> > >>
> > >>>
> > >>>> As you can imagine, iwlwifi is suffering from the
> > >>>> same problem and I would be interested in applying the same change,
> > >>>> but I'd love to understand a bit more :)
> > >>>
> > >>> Yes, iwlwifi (and some other drivers) suffer from this. However this
> > >>> happen when the newest firmware version is not installed on the system
> > >>> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
> > >>> it's not common.
> > >>
> > >> request_firmware_direct() was introduced at my request because (as you've
> > >> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
> > >> periods of time when starting.  The bug that this introduced was a 60 second
> > >> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
> > >> boot would complete in a little over one hour.
> > >>
> > >>>
> > >>> I started to see this currently, because that option was enabled on 
> > >>> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
> > >>> happened because of that, i.e. thermal device was not functional
> > >>> because f/w wasn't loaded due to big delay.
> > >>>
> > >>> I'm not sure if replacing to request_firmware_direct() is a good
> > >>> fix though. For example I can see this problem also on brcmfmac, which
> > >>> use request_firmware_nowait(). I think I would rather prefer special
> > >>> helper for firmware drivers that needs user helper and have
> > >>> request_firmware() be direct as default.
> > >>>
> > >>
> > >> The difference between request_firmware_direct() and request_firmware() is that
> > >> the _direct() version does not wait the 60 seconds for udev interaction.  The
> > >> only userspace check performed is to see if the file is there, and if the file
> > >> does exist it is provided to the driver to be applied to the hardware.
> > >>
> > >> So the real question to ask here is whether or not the ath10k, brcmfmac, and
> > >> iwlwifi require udev to do anything beyond checking for the existence and
> > >> loading the firmware image.  If they don't, then it is better to use
> > >> request_firmware_direct().
> > > 
> > > They don't need that, like 99% of the drivers I think, hence changing the
> > > default seems to be more reasonable. However changing 3 drivers would work
> > > for me as well, and that change do not introduce risk of broking drivers
> > > that require udev fw download.
> > > 
> > > iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
> > > use request_firmware_nowait(), so it first need to be converted to
> > > ordinary request_firmware(), but this should be doable and I can do
> > > that.
> > 
> > I am going bonkers here. This is the Nth time a discussion pops up on
> > firmware API usage. I stopped counting N :-( So the first issue was that
> > the INIT was taking to long as we were requesting firmware during probe
> > which was executed in the INIT context. So we added a worker and
> > register the driver from there. There was probably a reason for
> > switching to _no_wait() as well, but I do not recall the details. The
> > things is I don't know if I need user-space or not. I just need firmware
> > to get the device up and running. We have changed our driver a couple of
> > times now to accommodate something that in my opinion should have been
> > abstracted behind the firmware API in the first place and now here is
> > another proposal to change the drivers. Come on!
> 
> I understand you dislike that :-) Just to clarify the issue here:
> 
> Some drivers (including brcmfmac) request new firmware images, which are
> not yet available (i.e. development F/W versions) and then fall-back
> to older firmware version and works perfectly fine.

The right way to address this of course is to extend the FW API
so that a batch of firmware files can be hunted for and one can
easily annotate as a driver developer which are optional and which are
not. The API then will do everything for you.

I was considering this as a future extension to the firmware API
through the new extensible firmware API, the sysdata API. I'd prefer
to add that as a secondary step, but if someone wants to add support
for it now feel free to send me some patches against my tree.

> However with CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y configured, in case
> of missing F/W image, request firmware involve user space helper and
> waits 60s (loading_timeout value from drivers/base/firmware_class.c),
> what delays creating network interface and confuse users.

Correct, this has been a holy mess. One way to fix this on userspace
is to upgrade systemd with an exception to the udev kmod helper,
and avoid the timeout completely for kmod helper which loads drivers.

CONFIG_FW_LOADER_USER_HELPER_FALLBACK should be avoided.

> For brcmfmac this looks like this:
> 
> [   15.160923] brcmfmac 0000:03:00.0: Direct firmware load for brcm/brcmfmac4356-pcie.txt failed with error -2
> [   15.170759] brcmfmac 0000:03:00.0: Falling back to user helper
> <snip>
> [   75.709397] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Oct 22 2015 06:16:41 version 7.35.180.119 (r594535) FWID 01-1a5c4016
> [   75.736941] brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code (0x30 0x30)
> 
> Without CONFIG_FW_LOADER_USER_HELPER_FALLBACK first firmware request
> silently fail and then instantly next F/W image is loaded.

Yup.

> Another option to solve to problem would be stop requesting not
> available publicly firmware. However, I assume some drivers would
> like to preserve that option.

No, this seems counter productive given that the firmware is expected
to land eventually.

> > > However I wonder if changing that will not broke the case when
> > > driver is build-in in the kernel and f/w is not yet available when
> > > driver start to initialize. Or maybe nowadays this is not the case
> > > any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
> > > images are build-in in the kernel or copied to initramfs?
> > 
> > That is a nice idea, but I have not seen any change in that area. Could
> > have missed it.
> 
> I believe this is how the things are already done, IOW switching to
> request_firmware_direct() in the driver should be no harm.

We don't stuff firmware as built-in if the driver used MODULE_FIRMWARE()
and the driver is built-in. What commit did that?

The right thing to do is distros should avoid
CONFIG_FW_LOADER_USER_HELPER_FALLBACK and if they are stuck with it, a
systemd work around is possible. Upstream systemd already increased
the timeout to 180 seconds. Upstream also added a udevd --event-timeout
command line option, look into that. Other distros (OpenSUSE) avoids
the kmod timeout ;)

The timeout thing was simply a mistake, specially for kmod and it wasn't
well thought out. I've listed more serious implications for it here:

http://www.do-not-panic.com/2015/12/linux-asynchronous-probe.html

  Luis
Luis Chamberlain July 22, 2016, 10:19 p.m. UTC | #13
On Fri, Jul 22, 2016 at 08:51:36AM -0400, Prarit Bhargava wrote:
> On 07/22/2016 08:21 AM, Arend Van Spriel wrote:
> >> Another option to solve to problem would be stop requesting not
> >> available publicly firmware. However, I assume some drivers would
> >> like to preserve that option.
> > 
> > Actually, this is not the case with brcmfmac. We do need a firmware
> > file, ie. brcm/brcmfmac4356-pcie.bin, and also request for a nvram file,
> > ie. brcm/brcmfmac4356-pcie.txt. The latter is optional and the device
> > works fine without it.
> > 
> > What is still unclear to me is when request_firmware_direct() would fail
> > and in what circumstances the udev helper is a valid callback. Can you
> > explain such a scenario. Another question I have is what the reasons are
> > behind the 60 seconds timeout.
> 
> request_firmware_direct() will fail when the specified FW file is not present.
> This is different from request_firmware() which implements a usermode helper to
> potentially download firmware, or unpack a firmware image.
> 
> Re: 60 second timeout ... The 60 second timeout with request_firmware() is
> completely arbitrary.  There is no way for udev to signal back to the kernel
> that userspace helper has not completed its actions, so the kernel has a 60 dead
> man timer-ish delay.

Lets call it what it was: the 60 second timeout thing was simply a mistake.
Its no longer 60 seconds anyway, and in fact its accepted a dreaded issue.
What *we* should be doing is thinking about proper long term architecture now.
Async probe was one solution to some issues, a new flexible firmware API
that avoids the usermode helper 100% is another.

Distros stuck with the fallback option should review their strategies,
either disabling the fallback option, upgrade systemd, or use alternative
solutions (opensuse has a good one).

> >>>> However I wonder if changing that will not broke the case when
> >>>> driver is build-in in the kernel and f/w is not yet available when
> >>>> driver start to initialize. Or maybe nowadays this is not the case
> >>>> any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
> >>>> images are build-in in the kernel or copied to initramfs?
> >>>
> >>> That is a nice idea, but I have not seen any change in that area. Could
> >>> have missed it.
> >>
> >> I believe this is how the things are already done, IOW switching to
> >> request_firmware_direct() in the driver should be no harm.
> > 
> > Ok. What are the consequences when:
> > - driver is built-in.
> > - driver+firmware present on initramfs.
> > - driver on initramfs, firmware only present on rootfs.
> > - driver+firmware only on rootfs.
> > 
> > I assume the third one would be considered a configuration issue.
> 
> I think your question here can be answered by reading drivers/base/Kconfig:88,
> and reading about those 4 config options.

No, this documentation is terrible, I've posted some patches to help with this
mess.

> I could paraphrase it butI think the
> Kconfig notes are better than I could explain it.  Note that this is how things
> currently work with request_firmware_nowait().  IIRC request_firmware_nowait()
> is just an asynchronous version of request_firmware().

... its a mess.

  Luis
Emmanuel Grumbach July 25, 2016, 7:51 a.m. UTC | #14
On Sat, Jul 23, 2016 at 1:19 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Fri, Jul 22, 2016 at 08:51:36AM -0400, Prarit Bhargava wrote:
>> On 07/22/2016 08:21 AM, Arend Van Spriel wrote:
>> >> Another option to solve to problem would be stop requesting not
>> >> available publicly firmware. However, I assume some drivers would
>> >> like to preserve that option.
>> >
>> > Actually, this is not the case with brcmfmac. We do need a firmware
>> > file, ie. brcm/brcmfmac4356-pcie.bin, and also request for a nvram file,
>> > ie. brcm/brcmfmac4356-pcie.txt. The latter is optional and the device
>> > works fine without it.
>> >
>> > What is still unclear to me is when request_firmware_direct() would fail
>> > and in what circumstances the udev helper is a valid callback. Can you
>> > explain such a scenario. Another question I have is what the reasons are
>> > behind the 60 seconds timeout.
>>
>> request_firmware_direct() will fail when the specified FW file is not present.
>> This is different from request_firmware() which implements a usermode helper to
>> potentially download firmware, or unpack a firmware image.
>>
>> Re: 60 second timeout ... The 60 second timeout with request_firmware() is
>> completely arbitrary.  There is no way for udev to signal back to the kernel
>> that userspace helper has not completed its actions, so the kernel has a 60 dead
>> man timer-ish delay.
>
> Lets call it what it was: the 60 second timeout thing was simply a mistake.
> Its no longer 60 seconds anyway, and in fact its accepted a dreaded issue.
> What *we* should be doing is thinking about proper long term architecture now.
> Async probe was one solution to some issues, a new flexible firmware API
> that avoids the usermode helper 100% is another.
>
> Distros stuck with the fallback option should review their strategies,
> either disabling the fallback option, upgrade systemd, or use alternative
> solutions (opensuse has a good one).
>
>> >>>> However I wonder if changing that will not broke the case when
>> >>>> driver is build-in in the kernel and f/w is not yet available when
>> >>>> driver start to initialize. Or maybe nowadays this is not the case
>> >>>> any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w
>> >>>> images are build-in in the kernel or copied to initramfs?
>> >>>
>> >>> That is a nice idea, but I have not seen any change in that area. Could
>> >>> have missed it.
>> >>
>> >> I believe this is how the things are already done, IOW switching to
>> >> request_firmware_direct() in the driver should be no harm.
>> >
>> > Ok. What are the consequences when:
>> > - driver is built-in.
>> > - driver+firmware present on initramfs.
>> > - driver on initramfs, firmware only present on rootfs.
>> > - driver+firmware only on rootfs.
>> >
>> > I assume the third one would be considered a configuration issue.
>>
>> I think your question here can be answered by reading drivers/base/Kconfig:88,
>> and reading about those 4 config options.
>
> No, this documentation is terrible, I've posted some patches to help with this
> mess.
>
>> I could paraphrase it butI think the
>> Kconfig notes are better than I could explain it.  Note that this is how things
>> currently work with request_firmware_nowait().  IIRC request_firmware_nowait()
>> is just an asynchronous version of request_firmware().
>
> ... its a mess.
>

Awesome. I leave the code as is and ignore any RHEL bugs that are
related to that. If someone wants to improve all this, I'd be thankful
if he could do the work in the subsystem as Arend suggested.
Arend van Spriel July 28, 2016, 7:23 p.m. UTC | #15
On 23-07-16 00:05, Luis R. Rodriguez wrote:
> On Fri, Jul 22, 2016 at 10:38:24AM +0200, Arend Van Spriel wrote:
>> + Luis
>>
>> On 21-7-2016 13:51, Stanislaw Gruszka wrote:
>>> (cc: firmware and brcmfmac maintainers)
>>>
>>> On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
>>>>
>>>>
>>>> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
>>>>> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
>>>>>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
>>>>>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
>>>>>>>> Firmware files are versioned to prevent older
>>>>>>>> driver instances to load unsupported firmware
>>>>>>>> blobs. This is reflected with a fallback logic
>>>>>>>> which attempts to load several firmware files.
>>>>>>>>
>>>>>>>> This however produced a lot of unnecessary
>>>>>>>> warnings sometimes confusing users and leading
>>>>>>>> them to rename firmware files making things even
>>>>>>>> more confusing.
>>>>>>>
>>>>>>> This happens on kernels configured with
>>>>>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
>>>>>>> but also 60 seconds delay before loading next firmware version.
>>>>>>> For some reason RHEL kernel needs above config option, so this
>>>>>>> patch is very welcome from my perspective.
>>>>>>>
>>>>>>
>>>>>> Sorry for my ignorance but how does the firmware loading work if not
>>>>>> with udev's help?
>>>>>
>>>>> I'm not sure exactly, but I think kernel VFS layer is capable to copy
>>>>> file data directly from mounted filesystem without user space helper.
>>>>
>>>> Here's the situation: request_firmware() waits 60 seconds for udev to do its
>>>> loading magic via a "usermode helper".  This delay is there to allow, for
>>>> example, userspace to unpack or download a new firmware image or verify the
>>>> firmware image *in userspace* before providing it to the driver to apply to the HW.
>>>>
>>>> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
>>>> handshake on completion.
>>>>
>>>>>
>>>>>> As you can imagine, iwlwifi is suffering from the
>>>>>> same problem and I would be interested in applying the same change,
>>>>>> but I'd love to understand a bit more :)
>>>>>
>>>>> Yes, iwlwifi (and some other drivers) suffer from this. However this
>>>>> happen when the newest firmware version is not installed on the system
>>>>> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
>>>>> it's not common.
>>>>
>>>> request_firmware_direct() was introduced at my request because (as you've
>>>> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
>>>> periods of time when starting.  The bug that this introduced was a 60 second
>>>> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
>>>> boot would complete in a little over one hour.
>>>>
>>>>>
>>>>> I started to see this currently, because that option was enabled on 
>>>>> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
>>>>> happened because of that, i.e. thermal device was not functional
>>>>> because f/w wasn't loaded due to big delay.
>>>>>
>>>>> I'm not sure if replacing to request_firmware_direct() is a good
>>>>> fix though. For example I can see this problem also on brcmfmac, which
>>>>> use request_firmware_nowait(). I think I would rather prefer special
>>>>> helper for firmware drivers that needs user helper and have
>>>>> request_firmware() be direct as default.
>>>>>
>>>>
>>>> The difference between request_firmware_direct() and request_firmware() is that
>>>> the _direct() version does not wait the 60 seconds for udev interaction.  The
>>>> only userspace check performed is to see if the file is there, and if the file
>>>> does exist it is provided to the driver to be applied to the hardware.
>>>>
>>>> So the real question to ask here is whether or not the ath10k, brcmfmac, and
>>>> iwlwifi require udev to do anything beyond checking for the existence and
>>>> loading the firmware image.  If they don't, then it is better to use
>>>> request_firmware_direct().
>>>
>>> They don't need that, like 99% of the drivers I think, hence changing the
>>> default seems to be more reasonable. However changing 3 drivers would work
>>> for me as well, and that change do not introduce risk of broking drivers
>>> that require udev fw download.
>>>
>>> iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
>>> use request_firmware_nowait(), so it first need to be converted to
>>> ordinary request_firmware(), but this should be doable and I can do
>>> that.
>>
>> I am going bonkers here. This is the Nth time a discussion pops up on
>> firmware API usage. I stopped counting N :-( So the first issue was that
>> the INIT was taking to long as we were requesting firmware during probe
>> which was executed in the INIT context. So we added a worker and
>> register the driver from there. There was probably a reason for
>> switching to _no_wait() as well, but I do not recall the details. The
>> things is I don't know if I need user-space or not. I just need firmware
>> to get the device up and running. We have changed our driver a couple of
>> times now to accommodate something that in my opinion should have been
>> abstracted behind the firmware API in the first place and now here is
>> another proposal to change the drivers. Come on!
> 
> Its a big mess, but a lot of it has to do with the fact that none of the
> issues have been well documented. Its also not clear what distros, driver
> developers or users should do. I've tried helping with by providing such
> documentation and also providing grammar rules to avoid further issues [0],
> hopefully this series will be merged soon.
> 
> [0] https://marc.info/?l=linux-kernel&m=146611775314567
> 
> Using grammar rules the hunt with coccinelle as of today's linux-next
> reveals there are only 2 explicit users of the usermode helper, and
> I've vetted for these as well in the above patch series.
> 
> Now, other than this users will experience the usermode helper if the
> distribution messed up and build their kernel with the fallback
> usermode helper. If this was done on some old kernel the only way
> to fix that is to fix that kernel build or change drivers to avoid
> the usermode helper explicitly, unfortunately some API calls cannot
> avoid it .... I've documented all this in the above series.
> 
>>> However I wonder if changing that will not broke the case when
>>> driver is build-in in the kernel and f/w is not yet available when
>>> driver start to initialize.
> 
> Indeed, tons of races are in theory possible here ;) technically since we use a
> common API to read files directly now, a race might also be possible for other
> users of the API on init as well. I have some grammar rules to test for this in
> development, that is to vet that not only the firmware API is checked and we
> avoid on init but also other callers that use the same read API.
> 
>>> Or maybe nowadays this is not the case
>>> any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
>>> images are build-in in the kernel or copied to initramfs?
> 
> The firmware API is a mess and I've been trying to correct that
> with a more flexible API.
> 
>> That is a nice idea, but I have not seen any change in that area. Could
>> have missed it.
> 
> Extensions to the fw API are IMHO best done through a newer flexible
> API, feel free to refer to this development tree if you'd like to
> contribute:
> 
> https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux-next.git/log/?h=20160616-sysdata-v2

So I had a look and noticed commit c8df68e83392 ("firmware: annotate
thou shalt not request fw on init or probe"). Now this conflicts with
our wireless driver. The original suggestion a long, long time ago was
to use IFF_UP as trigger to go and request firmware. However, for that
we would need to register a netdevice during probe, and consequently we
should also have a wiphy instance registered. However, that has all kind
of feature flags for which we need firmware running on the device to
query what is supported and what not. I can make a fair bet that
brcmfmac is not the only driver with such a requirement. So how can we
crack that nut.

Regards,
Arend
Arend van Spriel July 28, 2016, 7:23 p.m. UTC | #16
On 23-07-16 00:15, Luis R. Rodriguez wrote:
> On Fri, Jul 22, 2016 at 12:26:00PM +0200, Stanislaw Gruszka wrote:
>> On Fri, Jul 22, 2016 at 10:38:24AM +0200, Arend Van Spriel wrote:
>>> + Luis
>>>
>>> On 21-7-2016 13:51, Stanislaw Gruszka wrote:
>>>> (cc: firmware and brcmfmac maintainers)
>>>>
>>>> On Thu, Jul 21, 2016 at 06:23:11AM -0400, Prarit Bhargava wrote:
>>>>>
>>>>>
>>>>> On 07/21/2016 04:05 AM, Stanislaw Gruszka wrote:
>>>>>> On Thu, Jul 21, 2016 at 10:36:42AM +0300, Emmanuel Grumbach wrote:
>>>>>>> On Thu, Jul 21, 2016 at 10:09 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
>>>>>>>> On Tue, Jul 19, 2016 at 03:00:37PM +0200, Michal Kazior wrote:
>>>>>>>>> Firmware files are versioned to prevent older
>>>>>>>>> driver instances to load unsupported firmware
>>>>>>>>> blobs. This is reflected with a fallback logic
>>>>>>>>> which attempts to load several firmware files.
>>>>>>>>>
>>>>>>>>> This however produced a lot of unnecessary
>>>>>>>>> warnings sometimes confusing users and leading
>>>>>>>>> them to rename firmware files making things even
>>>>>>>>> more confusing.
>>>>>>>>
>>>>>>>> This happens on kernels configured with
>>>>>>>> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and cause not only ugly warnings,
>>>>>>>> but also 60 seconds delay before loading next firmware version.
>>>>>>>> For some reason RHEL kernel needs above config option, so this
>>>>>>>> patch is very welcome from my perspective.
>>>>>>>>
>>>>>>>
>>>>>>> Sorry for my ignorance but how does the firmware loading work if not
>>>>>>> with udev's help?
>>>>>>
>>>>>> I'm not sure exactly, but I think kernel VFS layer is capable to copy
>>>>>> file data directly from mounted filesystem without user space helper.
>>>>>
>>>>> Here's the situation: request_firmware() waits 60 seconds for udev to do its
>>>>> loading magic via a "usermode helper".  This delay is there to allow, for
>>>>> example, userspace to unpack or download a new firmware image or verify the
>>>>> firmware image *in userspace* before providing it to the driver to apply to the HW.
>>>>>
>>>>> Why 60 seconds?  It is arbitrary and there is no way for udev & the kernel to
>>>>> handshake on completion.
>>>>>
>>>>>>
>>>>>>> As you can imagine, iwlwifi is suffering from the
>>>>>>> same problem and I would be interested in applying the same change,
>>>>>>> but I'd love to understand a bit more :)
>>>>>>
>>>>>> Yes, iwlwifi (and some other drivers) suffer from this. However this
>>>>>> happen when the newest firmware version is not installed on the system
>>>>>> and CONFIG_FW_LOADER_USER_HELPER_FALLBACK is enabled. What I suppose
>>>>>> it's not common.
>>>>>
>>>>> request_firmware_direct() was introduced at my request because (as you've
>>>>> noticed) when CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y drivers may stall for long
>>>>> periods of time when starting.  The bug that this introduced was a 60 second
>>>>> delay per logical cpu when starting a system.  On a 64 cpu system that meant the
>>>>> boot would complete in a little over one hour.
>>>>>
>>>>>>
>>>>>> I started to see this currently, because that option was enabled on 
>>>>>> RHEL kernel. BTW: I think Prarit iwlwifi thermal_zone problem was
>>>>>> happened because of that, i.e. thermal device was not functional
>>>>>> because f/w wasn't loaded due to big delay.
>>>>>>
>>>>>> I'm not sure if replacing to request_firmware_direct() is a good
>>>>>> fix though. For example I can see this problem also on brcmfmac, which
>>>>>> use request_firmware_nowait(). I think I would rather prefer special
>>>>>> helper for firmware drivers that needs user helper and have
>>>>>> request_firmware() be direct as default.
>>>>>>
>>>>>
>>>>> The difference between request_firmware_direct() and request_firmware() is that
>>>>> the _direct() version does not wait the 60 seconds for udev interaction.  The
>>>>> only userspace check performed is to see if the file is there, and if the file
>>>>> does exist it is provided to the driver to be applied to the hardware.
>>>>>
>>>>> So the real question to ask here is whether or not the ath10k, brcmfmac, and
>>>>> iwlwifi require udev to do anything beyond checking for the existence and
>>>>> loading the firmware image.  If they don't, then it is better to use
>>>>> request_firmware_direct().
>>>>
>>>> They don't need that, like 99% of the drivers I think, hence changing the
>>>> default seems to be more reasonable. However changing 3 drivers would work
>>>> for me as well, and that change do not introduce risk of broking drivers
>>>> that require udev fw download.
>>>>
>>>> iwlwifi and ath10k are trivial, bcrmfmac is a bit more complex as it
>>>> use request_firmware_nowait(), so it first need to be converted to
>>>> ordinary request_firmware(), but this should be doable and I can do
>>>> that.
>>>
>>> I am going bonkers here. This is the Nth time a discussion pops up on
>>> firmware API usage. I stopped counting N :-( So the first issue was that
>>> the INIT was taking to long as we were requesting firmware during probe
>>> which was executed in the INIT context. So we added a worker and
>>> register the driver from there. There was probably a reason for
>>> switching to _no_wait() as well, but I do not recall the details. The
>>> things is I don't know if I need user-space or not. I just need firmware
>>> to get the device up and running. We have changed our driver a couple of
>>> times now to accommodate something that in my opinion should have been
>>> abstracted behind the firmware API in the first place and now here is
>>> another proposal to change the drivers. Come on!
>>
>> I understand you dislike that :-) Just to clarify the issue here:
>>
>> Some drivers (including brcmfmac) request new firmware images, which are
>> not yet available (i.e. development F/W versions) and then fall-back
>> to older firmware version and works perfectly fine.
> 
> The right way to address this of course is to extend the FW API
> so that a batch of firmware files can be hunted for and one can
> easily annotate as a driver developer which are optional and which are
> not. The API then will do everything for you.

That is more or less how I abstracted it in brcmfmac. Our batch only
consists of max two files with binary executable image and rf
calibration data, which may or may not be optional. So we do not have
firmware fallback sequence.

For iwlwifi and ath10k I guess the use-case is that the batch is an
ordered list from newest supported firmware to oldest supported firmware.

> I was considering this as a future extension to the firmware API
> through the new extensible firmware API, the sysdata API. I'd prefer
> to add that as a secondary step, but if someone wants to add support
> for it now feel free to send me some patches against my tree.
> 
>> However with CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y configured, in case
>> of missing F/W image, request firmware involve user space helper and
>> waits 60s (loading_timeout value from drivers/base/firmware_class.c),
>> what delays creating network interface and confuse users.
> 
> Correct, this has been a holy mess. One way to fix this on userspace
> is to upgrade systemd with an exception to the udev kmod helper,
> and avoid the timeout completely for kmod helper which loads drivers.
> 
> CONFIG_FW_LOADER_USER_HELPER_FALLBACK should be avoided.
> 
>> For brcmfmac this looks like this:
>>
>> [   15.160923] brcmfmac 0000:03:00.0: Direct firmware load for brcm/brcmfmac4356-pcie.txt failed with error -2
>> [   15.170759] brcmfmac 0000:03:00.0: Falling back to user helper
>> <snip>
>> [   75.709397] brcmfmac: brcmf_c_preinit_dcmds: Firmware version = wl0: Oct 22 2015 06:16:41 version 7.35.180.119 (r594535) FWID 01-1a5c4016
>> [   75.736941] brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code (0x30 0x30)
>>
>> Without CONFIG_FW_LOADER_USER_HELPER_FALLBACK first firmware request
>> silently fail and then instantly next F/W image is loaded.
> 
> Yup.
> 
>> Another option to solve to problem would be stop requesting not
>> available publicly firmware. However, I assume some drivers would
>> like to preserve that option.
> 
> No, this seems counter productive given that the firmware is expected
> to land eventually.
> 
>>>> However I wonder if changing that will not broke the case when
>>>> driver is build-in in the kernel and f/w is not yet available when
>>>> driver start to initialize. Or maybe nowadays this is not the case
>>>> any longer, i.e. the MODULE_FIRMWARE macros assure proper f/w 
>>>> images are build-in in the kernel or copied to initramfs?
>>>
>>> That is a nice idea, but I have not seen any change in that area. Could
>>> have missed it.
>>
>> I believe this is how the things are already done, IOW switching to
>> request_firmware_direct() in the driver should be no harm.
> 
> We don't stuff firmware as built-in if the driver used MODULE_FIRMWARE()
> and the driver is built-in. What commit did that?
> 
> The right thing to do is distros should avoid
> CONFIG_FW_LOADER_USER_HELPER_FALLBACK and if they are stuck with it, a
> systemd work around is possible. Upstream systemd already increased
> the timeout to 180 seconds. Upstream also added a udevd --event-timeout
> command line option, look into that. Other distros (OpenSUSE) avoids
> the kmod timeout ;)
> 
> The timeout thing was simply a mistake, specially for kmod and it wasn't
> well thought out. I've listed more serious implications for it here:
> 
> http://www.do-not-panic.com/2015/12/linux-asynchronous-probe.html

Thanks. Will read that.

Regards,
Arend
Luis Chamberlain July 28, 2016, 11:28 p.m. UTC | #17
On Thu, Jul 28, 2016 at 09:23:35PM +0200, Arend van Spriel wrote:
> On 23-07-16 00:05, Luis R. Rodriguez wrote:
> > The firmware API is a mess and I've been trying to correct that
> > with a more flexible API.

<-- snip -->

> > Extensions to the fw API are IMHO best done through a newer flexible
> > API, feel free to refer to this development tree if you'd like to
> > contribute:
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux-next.git/log/?h=20160616-sysdata-v2
> 
> So I had a look and noticed commit c8df68e83392 ("firmware: annotate
> thou shalt not request fw on init or probe"). Now this conflicts with
> our wireless driver. The original suggestion a long, long time ago was
> to use IFF_UP as trigger to go and request firmware. However, for that
> we would need to register a netdevice during probe, and consequently we
> should also have a wiphy instance registered. However, that has all kind
> of feature flags for which we need firmware running on the device to
> query what is supported and what not. I can make a fair bet that
> brcmfmac is not the only driver with such a requirement. So how can we
> crack that nut.

Glad you asked.

Despite my latest set of updates on documentation for the firmware_class [0] (not
yet merged), and it being based on the latest discussed and agreed upon we
really have nothing well ironed out for what you describe, so let's try
to figure that out now.

To be clear, folks had originally said using the firmware API on *init* was
dumb, and some of this came up because of the infamous systemd udev timeout.
For completeness, I've documented some of the history of this issue
in great detail [1], mostly because I had to deal with a bug at SUSE about
this, and find a proper solution. Avoiding re-iterating *exactly why* the
timeout for kmod was ill-founded, and assuming you all now buy that,
the summary of facts of the *why* it turns out to be a bad idea to use the
firmware API on init *or* probe:

  a) by default the driver model actually calls both init and probe serially and
     synchronously
  b) we have no deterministic way of being certain we have whatever filesystem
     the driver needed as ready, the firmware may live in initramfs, or may only
     be available later on the real filesystem, and even after that the system
     may be designed to pivot_root.

In terms of solutions, lets discuss, here are a few options I can think of:

  1) Because of b) if you know you are going to use the firmware API you should
     just stuff firmware that is needed on init or probe as built-in
     (CONFIG_EXTRA_FIRMWARE) or stuff it into initramfs. This approach is generally
     accepted, however there is no systematic approach to ensure this is done. Now
     that we have coccinelle grammar to find these drivers it should be relatively
     easy to identify them and require the firmware as part of the distribution's
     initramfs or peg them as part of CONFIG_EXTRA_FIRMWARE if a distro prefers that.

     The only issue with this approach is its left up to the distribution to resolve.

  2) If the driver *knows* that probe may take a while it can request the driver core
     to probe the driver asynchronously, it can do so by using:

static struct pci_driver foo_pci_driver = {
      ...
      .driver.probe_type = PROBE_PREFER_ASYNCHRONOUS,
};

    This would not really resolve the deterministic issues listed in b) and for
    this reason this is not really a firmware-API-on-probe solution -- its an
    annotation to help avoid delays boot due to knowing probe may take a while.

  3) Userspace that loads modules can / should pass the async probe generic
     module parameter "async_probe" (for instance 'modprobe ath10k async_probe')
     when loading all modules. This should already be relatively
     safe when using on modules. This is what systemd long ago assumed was
     being done anyway. Again, also this is not really a firmware-API-on-probe solution,
     it can however be used by systemd for instance to help avoid delays on
     boot due to module lengthy probe calls

As it stands we have no resolution for the deterministic existential issues of
the firmware but in practice 1-3 should help. 2-3 should probably be done
regardless. I've been meaning to send patches for 3) but haven't had time.

As far as the deterministic existential firmware issue... Since we're just
reading files from the filesystem now (there are exceptions to this, but my
goal is to corner-case this code with the sysdata API), if we really wanted
something rock solid for these drivers I suppose we could consider implementing
an event driven probe if a driver requests for async probe. For instance, if
async probe was requested and the driver has MODULE_FIRMWARE(firmware_name)
we could add a notifier to probe the driver once the firmware is accessible.

For all intents and purposes though -- although I know the real issue here
the deterministic way of knowing when firmware is available, in practice
annotating your driver with PROBE_PREFER_ASYNCHRONOUS should solve most
races. This would not be a resolution, but rather an annotation to the fact
your driver probe does take a while -- and should make most folks happy until
we have a proper deterministic firmware solution, I think.

I welcome other ideas.

[0] https://lkml.kernel.org/r/1466117661-22075-1-git-send-email-mcgrof@kernel.org
[1] http://www.do-not-panic.com/2015/12/linux-asynchronous-probe.html

  Luis
Kalle Valo Aug. 2, 2016, 11:10 a.m. UTC | #18
"Luis R. Rodriguez" <mcgrof@kernel.org> writes:

> I was considering this as a future extension to the firmware API
> through the new extensible firmware API, the sysdata API.

I think Linus mentioned this already, but I want to reiterate anyway.
The name "sysdata" is horrible, I didn't have any idea what it means
until I read your description. Please continue to use the term
"firmware", anyone already know what it means.
Kalle Valo Aug. 2, 2016, 11:18 a.m. UTC | #19
Michal Kazior <michal.kazior@tieto.com> writes:

> Firmware files are versioned to prevent older
> driver instances to load unsupported firmware
> blobs. This is reflected with a fallback logic
> which attempts to load several firmware files.
>
> This however produced a lot of unnecessary
> warnings sometimes confusing users and leading
> them to rename firmware files making things even
> more confusing.
>
> Hence use request_firmware_direct() which does not
> produce extra warnings. This shouldn't really
> break anything because most modern systems don't
> rely on udev/hotplug helpers to load firmware
> files anymore.
>
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>

Nice. These "firmware not found" messages have been confusing ath10k
users for ages and should be properly fixed. I hope we find a solution.

But I talked with Felix about this and he made a good point about board
and calibration files. Calibration files might be created runtime, for
example retrieved from NAND etc, and this might break the use case when
ath10k is statically linked to kernel. Is the combination used in real
life and should we care, that I do not know, but I'm worried of possible
regressions. I guess LEDE/openwrt always loads ath10k as a module and
after the calibration file is created?
Felix Fietkau Aug. 2, 2016, 11:24 a.m. UTC | #20
On 2016-08-02 13:18, Valo, Kalle wrote:
> Michal Kazior <michal.kazior@tieto.com> writes:
> 
>> Firmware files are versioned to prevent older
>> driver instances to load unsupported firmware
>> blobs. This is reflected with a fallback logic
>> which attempts to load several firmware files.
>>
>> This however produced a lot of unnecessary
>> warnings sometimes confusing users and leading
>> them to rename firmware files making things even
>> more confusing.
>>
>> Hence use request_firmware_direct() which does not
>> produce extra warnings. This shouldn't really
>> break anything because most modern systems don't
>> rely on udev/hotplug helpers to load firmware
>> files anymore.
>>
>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
> 
> Nice. These "firmware not found" messages have been confusing ath10k
> users for ages and should be properly fixed. I hope we find a solution.
> 
> But I talked with Felix about this and he made a good point about board
> and calibration files. Calibration files might be created runtime, for
> example retrieved from NAND etc, and this might break the use case when
> ath10k is statically linked to kernel. Is the combination used in real
> life and should we care, that I do not know, but I'm worried of possible
> regressions. I guess LEDE/openwrt always loads ath10k as a module and
> after the calibration file is created?
ath10k is always loaded as a module, and the calibration file is created
by a script that's triggered by the firmware uevent.

- Felix
Luis Chamberlain Aug. 2, 2016, 2:16 p.m. UTC | #21
On Tue, Aug 02, 2016 at 11:10:22AM +0000, Valo, Kalle wrote:
> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> 
> > I was considering this as a future extension to the firmware API
> > through the new extensible firmware API, the sysdata API.
> 
> I think Linus mentioned this already, but I want to reiterate anyway.
> The name "sysdata" is horrible, I didn't have any idea what it means
> until I read your description. Please continue to use the term
> "firmware", anyone already know what it means.

We've gone well past using the firmware API for firmware though, if
we use it for 802.11 to replace CRDA for instance its really odd to
be calling it firmware. But sure... I will rebrand again to firmware...

  Luis
Arend van Spriel Aug. 3, 2016, 11:33 a.m. UTC | #22
On 02-08-16 16:16, Luis R. Rodriguez wrote:
> On Tue, Aug 02, 2016 at 11:10:22AM +0000, Valo, Kalle wrote:
>> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>>
>>> I was considering this as a future extension to the firmware API
>>> through the new extensible firmware API, the sysdata API.
>>
>> I think Linus mentioned this already, but I want to reiterate anyway.
>> The name "sysdata" is horrible, I didn't have any idea what it means
>> until I read your description. Please continue to use the term
>> "firmware", anyone already know what it means.
> 
> We've gone well past using the firmware API for firmware though, if
> we use it for 802.11 to replace CRDA for instance its really odd to
> be calling it firmware. But sure... I will rebrand again to firmware...

I tend to agree. Although some people even call an OpenWrt image
firmware. Guess it is just in the eye of the beholder.

Regards,
Arend
Luis Chamberlain Aug. 3, 2016, 2:21 p.m. UTC | #23
On Wed, Aug 03, 2016 at 01:33:31PM +0200, Arend van Spriel wrote:
> On 02-08-16 16:16, Luis R. Rodriguez wrote:
> > On Tue, Aug 02, 2016 at 11:10:22AM +0000, Valo, Kalle wrote:
> >> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> >>
> >>> I was considering this as a future extension to the firmware API
> >>> through the new extensible firmware API, the sysdata API.
> >>
> >> I think Linus mentioned this already, but I want to reiterate anyway.
> >> The name "sysdata" is horrible, I didn't have any idea what it means
> >> until I read your description. Please continue to use the term
> >> "firmware", anyone already know what it means.
> > 
> > We've gone well past using the firmware API for firmware though, if
> > we use it for 802.11 to replace CRDA for instance its really odd to
> > be calling it firmware. But sure... I will rebrand again to firmware...
> 
> I tend to agree. Although some people even call an OpenWrt image
> firmware. Guess it is just in the eye of the beholder.

Sure...

Come to think of it I'll still go with "sysdata", this is a very minor
detail, do let me know if there is anything technical rather than
the color of the bikeshed [0] over the patches.

If you'd like another color in my bikeshed please let me know what
color exactly you prefer and I'll consider it.

[0] http://phk.freebsd.dk/sagas/bikeshed.html

  Luis
Kalle Valo Aug. 3, 2016, 3:04 p.m. UTC | #24
"Luis R. Rodriguez" <mcgrof@kernel.org> writes:

> On Wed, Aug 03, 2016 at 01:33:31PM +0200, Arend van Spriel wrote:
>> On 02-08-16 16:16, Luis R. Rodriguez wrote:
>> > On Tue, Aug 02, 2016 at 11:10:22AM +0000, Valo, Kalle wrote:
>> >> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>> >>
>> >>> I was considering this as a future extension to the firmware API
>> >>> through the new extensible firmware API, the sysdata API.
>> >>
>> >> I think Linus mentioned this already, but I want to reiterate anyway.
>> >> The name "sysdata" is horrible, I didn't have any idea what it means
>> >> until I read your description. Please continue to use the term
>> >> "firmware", anyone already know what it means.
>> > 
>> > We've gone well past using the firmware API for firmware though, if
>> > we use it for 802.11 to replace CRDA for instance its really odd to
>> > be calling it firmware. But sure... I will rebrand again to firmware...
>> 
>> I tend to agree. Although some people even call an OpenWrt image
>> firmware. Guess it is just in the eye of the beholder.
>
> Sure...
>
> Come to think of it I'll still go with "sysdata", this is a very minor
> detail, do let me know if there is anything technical rather than
> the color of the bikeshed [0] over the patches.

Well, you don't seem to care but I prefer that the terminology is clear
and I don't want to waste people's time browsing the source to find out
what something means. Even "driverdata" would be more descriptive for me
than "sysdata".

Actually, what does the "sys" refer here, system? And what system is
that exactly?
Luis Chamberlain Aug. 3, 2016, 5:10 p.m. UTC | #25
On Wed, Aug 03, 2016 at 03:04:39PM +0000, Valo, Kalle wrote:
> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> 
> > On Wed, Aug 03, 2016 at 01:33:31PM +0200, Arend van Spriel wrote:
> >> On 02-08-16 16:16, Luis R. Rodriguez wrote:
> >> > On Tue, Aug 02, 2016 at 11:10:22AM +0000, Valo, Kalle wrote:
> >> >> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
> >> >>
> >> >>> I was considering this as a future extension to the firmware API
> >> >>> through the new extensible firmware API, the sysdata API.
> >> >>
> >> >> I think Linus mentioned this already, but I want to reiterate anyway.
> >> >> The name "sysdata" is horrible, I didn't have any idea what it means
> >> >> until I read your description. Please continue to use the term
> >> >> "firmware", anyone already know what it means.
> >> > 
> >> > We've gone well past using the firmware API for firmware though, if
> >> > we use it for 802.11 to replace CRDA for instance its really odd to
> >> > be calling it firmware. But sure... I will rebrand again to firmware...
> >> 
> >> I tend to agree. Although some people even call an OpenWrt image
> >> firmware. Guess it is just in the eye of the beholder.
> >
> > Sure...
> >
> > Come to think of it I'll still go with "sysdata", this is a very minor
> > detail, do let me know if there is anything technical rather than
> > the color of the bikeshed [0] over the patches.
> 
> Well, you don't seem to care but I prefer that the terminology is clear
> and I don't want to waste people's time browsing the source to find out
> what something means.

Its not that I don't care, its this is a super trivial matter, like the
color of a bikeshed, and I'd much prefer to put energy and review on
technical matters.

> Even "driverdata" would be more descriptive for me
> than "sysdata".
> 
> Actually, what does the "sys" refer here, system? And what system is
> that exactly?

Yes system, so as in system data file. "driver_data" is just as good.
Although who knows, others may want to paint the bikeshed a different
color.

The core reason why the name change is to make emphasis of the fact that
we've gone way past the point where the APIs are used for non-firmware
and I expect this use will only grow.

  Luis
Arend van Spriel Aug. 3, 2016, 7:19 p.m. UTC | #26
On 03-08-16 19:10, Luis R. Rodriguez wrote:
> On Wed, Aug 03, 2016 at 03:04:39PM +0000, Valo, Kalle wrote:
>> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>>
>>> On Wed, Aug 03, 2016 at 01:33:31PM +0200, Arend van Spriel wrote:
>>>> On 02-08-16 16:16, Luis R. Rodriguez wrote:
>>>>> On Tue, Aug 02, 2016 at 11:10:22AM +0000, Valo, Kalle wrote:
>>>>>> "Luis R. Rodriguez" <mcgrof@kernel.org> writes:
>>>>>>
>>>>>>> I was considering this as a future extension to the firmware API
>>>>>>> through the new extensible firmware API, the sysdata API.
>>>>>>
>>>>>> I think Linus mentioned this already, but I want to reiterate anyway.
>>>>>> The name "sysdata" is horrible, I didn't have any idea what it means
>>>>>> until I read your description. Please continue to use the term
>>>>>> "firmware", anyone already know what it means.
>>>>>
>>>>> We've gone well past using the firmware API for firmware though, if
>>>>> we use it for 802.11 to replace CRDA for instance its really odd to
>>>>> be calling it firmware. But sure... I will rebrand again to firmware...
>>>>
>>>> I tend to agree. Although some people even call an OpenWrt image
>>>> firmware. Guess it is just in the eye of the beholder.
>>>
>>> Sure...
>>>
>>> Come to think of it I'll still go with "sysdata", this is a very minor
>>> detail, do let me know if there is anything technical rather than
>>> the color of the bikeshed [0] over the patches.
>>
>> Well, you don't seem to care but I prefer that the terminology is clear
>> and I don't want to waste people's time browsing the source to find out
>> what something means.
> 
> Its not that I don't care, its this is a super trivial matter, like the
> color of a bikeshed, and I'd much prefer to put energy and review on
> technical matters.
> 
>> Even "driverdata" would be more descriptive for me
>> than "sysdata".
>>
>> Actually, what does the "sys" refer here, system? And what system is
>> that exactly?
> 
> Yes system, so as in system data file. "driver_data" is just as good.
> Although who knows, others may want to paint the bikeshed a different
> color.
> 
> The core reason why the name change is to make emphasis of the fact that
> we've gone way past the point where the APIs are used for non-firmware
> and I expect this use will only grow.

Here some colors to add to the palet with "color code" in brackets:
- device specific data (dsd): although if CRDA stuff is loaded it is no
	longer tied to a device.
- eXtensible firmware API (xfw): the ever popular x factor :-p
- binary blob loader (bbl): just popped up in me head.

Need more beer to go on.

Regards,
Arend
Kalle Valo Jan. 20, 2017, 12:51 p.m. UTC | #27
Michal Kazior <michal.kazior@tieto.com> wrote:
> Firmware files are versioned to prevent older
> driver instances to load unsupported firmware
> blobs. This is reflected with a fallback logic
> which attempts to load several firmware files.
> 
> This however produced a lot of unnecessary
> warnings sometimes confusing users and leading
> them to rename firmware files making things even
> more confusing.
> 
> Hence use request_firmware_direct() which does not
> produce extra warnings. This shouldn't really
> break anything because most modern systems don't
> rely on udev/hotplug helpers to load firmware
> files anymore.
> 
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>

This ended into a rather long discussion, see the full thread from the patchwork link
below, but I'll try to summarise it here:

* Nobody stepped up and mentioned that they need/use the user fallback helper with ath10k.

* Felix confirmed that LEDE creates the calibration file before loading ath10k
  so this should not break LEDE.

* This also fixes a 60 second delay per _each_ unexistent firmware/calibration
  file with distros which have CONFIG_FW_LOADER_USER_HELPER_FALLBACK enabled,
  RHEL being a notable example. Using ath10k with firmware-2.bin this might
  end up into a five minute delay in boot.

* Luis is working on new drvdata interface for kernel, but that's not merged yet.

Based on this I think the right approach is to apply this patch. Any concerns?

While writing this I started to suspect is it just by accident that
request_firmware_direct() does not print any error messages and
request_firmware() again does print those? Let's hope nobody decides to change
that.  And at least Luis' drvdata interface has a documented 'optional' flag,
so we can always switch to using that (once it's merged):

* struct drvdata_req_params - driver data request parameters
* @optional: if true it is not a hard requirement by the caller that this
*	file be present. An error will not be recorded if the file is not
*	found.

Michal, do you mind if I'll add more info to the commit log and submit this RFC
as a proper patch? It still seems to apply and work just fine.
Michal Kazior Jan. 20, 2017, 12:56 p.m. UTC | #28
On 20 January 2017 at 13:51, Kalle Valo <kvalo@qca.qualcomm.com> wrote:
> Michal Kazior <michal.kazior@tieto.com> wrote:
>> Firmware files are versioned to prevent older
>> driver instances to load unsupported firmware
>> blobs. This is reflected with a fallback logic
>> which attempts to load several firmware files.
>>
>> This however produced a lot of unnecessary
>> warnings sometimes confusing users and leading
>> them to rename firmware files making things even
>> more confusing.
>>
>> Hence use request_firmware_direct() which does not
>> produce extra warnings. This shouldn't really
>> break anything because most modern systems don't
>> rely on udev/hotplug helpers to load firmware
>> files anymore.
>>
>> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
>
> This ended into a rather long discussion, see the full thread from the patchwork link
> below, but I'll try to summarise it here:
>
> * Nobody stepped up and mentioned that they need/use the user fallback helper with ath10k.
>
> * Felix confirmed that LEDE creates the calibration file before loading ath10k
>   so this should not break LEDE.
>
> * This also fixes a 60 second delay per _each_ unexistent firmware/calibration
>   file with distros which have CONFIG_FW_LOADER_USER_HELPER_FALLBACK enabled,
>   RHEL being a notable example. Using ath10k with firmware-2.bin this might
>   end up into a five minute delay in boot.
>
> * Luis is working on new drvdata interface for kernel, but that's not merged yet.
>
> Based on this I think the right approach is to apply this patch. Any concerns?
>
> While writing this I started to suspect is it just by accident that
> request_firmware_direct() does not print any error messages and
> request_firmware() again does print those? Let's hope nobody decides to change
> that.  And at least Luis' drvdata interface has a documented 'optional' flag,
> so we can always switch to using that (once it's merged):
>
> * struct drvdata_req_params - driver data request parameters
> * @optional: if true it is not a hard requirement by the caller that this
> *       file be present. An error will not be recorded if the file is not
> *       found.
>
> Michal, do you mind if I'll add more info to the commit log and submit this RFC
> as a proper patch? It still seems to apply and work just fine.

I don't mind :)


MichaƂ
Kalle Valo Jan. 31, 2017, 3:02 p.m. UTC | #29
Michal Kazior <michal.kazior@tieto.com> wrote:
> Firmware files are versioned to prevent older
> driver instances to load unsupported firmware
> blobs. This is reflected with a fallback logic
> which attempts to load several firmware files.
> 
> This however produced a lot of unnecessary
> warnings sometimes confusing users and leading
> them to rename firmware files making things even
> more confusing.
> 
> Hence use request_firmware_direct() which does not
> produce extra warnings. This shouldn't really
> break anything because most modern systems don't
> rely on udev/hotplug helpers to load firmware
> files anymore.
> 
> Signed-off-by: Michal Kazior <michal.kazior@tieto.com>

While testing Erik's usb patches I noticed one problem if the firmware (or
board file) is not found:

[  517.389399] usbcore: registered new interface driver ath10k_usb
[  517.391756] usb 2-1.3: could not fetch firmware files (-2)
[  517.391985] usb 2-1.3: could not probe fw (-2)

Now the user has no way to know what file is exactly missing. I'll try to
improve that in v2.
diff mbox

Patch

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index e88982921aa3..81bfb71fe876 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -431,7 +431,10 @@  static const struct firmware *ath10k_fetch_fw_file(struct ath10k *ar,
 		dir = ".";
 
 	snprintf(filename, sizeof(filename), "%s/%s", dir, file);
-	ret = request_firmware(&fw, filename, ar->dev);
+	ret = request_firmware_direct(&fw, filename, ar->dev);
+	ath10k_dbg(ar, ATH10K_DBG_BOOT, "boot fw request '%s': %d\n",
+		   filename, ret);
+
 	if (ret)
 		return ERR_PTR(ret);
 
@@ -1089,12 +1092,8 @@  int ath10k_core_fetch_firmware_api_n(struct ath10k *ar, const char *name,
 	/* first fetch the firmware file (firmware-*.bin) */
 	fw_file->firmware = ath10k_fetch_fw_file(ar, ar->hw_params.fw.dir,
 						 name);
-	if (IS_ERR(fw_file->firmware)) {
-		ath10k_err(ar, "could not fetch firmware file '%s/%s': %ld\n",
-			   ar->hw_params.fw.dir, name,
-			   PTR_ERR(fw_file->firmware));
+	if (IS_ERR(fw_file->firmware))
 		return PTR_ERR(fw_file->firmware);
-	}
 
 	data = fw_file->firmware->data;
 	len = fw_file->firmware->size;
diff --git a/drivers/net/wireless/ath/ath10k/testmode.c b/drivers/net/wireless/ath/ath10k/testmode.c
index 120f4234d3b0..fe49e7a83d00 100644
--- a/drivers/net/wireless/ath/ath10k/testmode.c
+++ b/drivers/net/wireless/ath/ath10k/testmode.c
@@ -149,7 +149,10 @@  static int ath10k_tm_fetch_utf_firmware_api_1(struct ath10k *ar,
 		 ar->hw_params.fw.dir, ATH10K_FW_UTF_FILE);
 
 	/* load utf firmware image */
-	ret = request_firmware(&fw_file->firmware, filename, ar->dev);
+	ret = request_firmware_direct(&fw_file->firmware, filename, ar->dev);
+	ath10k_dbg(ar, ATH10K_DBG_TESTMODE, "testmode fw request '%s': %d\n",
+		   filename, ret);
+
 	if (ret) {
 		ath10k_warn(ar, "failed to retrieve utf firmware '%s': %d\n",
 			    filename, ret);