
[RFC,2/5] iwlwifi: fix request_module() use

Message ID 20170221022337.GG31264@wotan.suse.de (mailing list archive)
State Superseded
Delegated to: Luca Coelho

Commit Message

Luis Chamberlain Feb. 21, 2017, 2:23 a.m. UTC
On Sun, Feb 19, 2017 at 09:47:59AM +0000, Grumbach, Emmanuel wrote:
> > 
> > The return value of request_module() being 0 does not mean that the driver
> > which was requested has loaded. To properly check that the driver was
> > loaded each driver can use internal mechanisms to vet the driver is now
> > present. The helper try_then_request_module() was added to help with
> > this, allowing drivers to specify their own validation as the first argument.
> > 
> > On iwlwifi the use case is a bit more complicated given that the value we
> > need to check for is protected with a mutex later used on the
> > module_init() of the module we are asking for. If we were to lock and
> > request_module() we'd deadlock. iwlwifi needs its own wrapper then so it
> > can handle its special locking requirements.
> > 
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> 
> I don't see the problem with the current code. We don't assume that everything has been
> loaded immediately after request_module returns. We just free the intermediate firmware
> structures that won't be using anymore. What happens here is that after request_module
> returns, we patiently wait until it is loaded, and when that happens, iwl{d,m}vm's init function
> will be called.

Right I get that.

Today the code complains that its respective opmode module was not loaded
whenever request_module() does not return 0. As the commit log explains, a
return code of 0 is not sufficient to ensure that a module loaded. So the
current print is almost pointless, and it is best we either:

a) just remove the print and use request_module_nowait() instead (this is more
   in alignment with what your code actually does today); or

b) fix the request_module() use so that the error print matches the expected
   and proper recommended use of request_module() (what this patch does)

I prefer a) actually but I had to show what b) looked like first :)
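
As a rough illustration of option b), here is a minimal userspace model of a
synchronous request followed by a verifier. It is a sketch, not kernel code:
fake_request_module(), fake_module_init(), opmode_present() and the lock
helpers are hypothetical stand-ins, and the model only covers the case the
commit message describes, where the request returns 0 but the opmode never
registers. The verifier takes the lock only around the check and never holds
it across the request, because the requested module's init takes the same lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names are kernel APIs. */
struct opmode_entry {
	const char *name;
	const void *ops;	/* non-NULL once the opmode registered itself */
};

/* Stands in for the mutex protecting the opmode table (not recursive). */
static int table_lock_depth;

static void table_lock(void)
{
	assert(table_lock_depth == 0);	/* taking it twice would deadlock */
	table_lock_depth++;
}

static void table_unlock(void)
{
	assert(table_lock_depth == 1);
	table_lock_depth--;
}

/* Stand-in for the opmode's module_init(): it takes the lock to register. */
static void fake_module_init(struct opmode_entry *op, const void *ops)
{
	table_lock();
	op->ops = ops;
	table_unlock();
}

/*
 * Stand-in for a synchronous module request. It returns 0 in both cases,
 * modelling a request that "succeeds" whether or not the opmode registers.
 */
static int fake_request_module(struct opmode_entry *op, bool init_registers)
{
	static const int dummy_ops = 1;

	if (init_registers)
		fake_module_init(op, &dummy_ops);
	return 0;
}

/* Verifier: inspect driver state under the lock. */
static bool opmode_present(struct opmode_entry *op)
{
	bool present;

	table_lock();
	present = op->ops != NULL;
	table_unlock();
	return present;
}

/* Returns 0 if the opmode is present after the request, -1 otherwise. */
static int load_opmode_verified(struct opmode_entry *op, bool init_registers)
{
	if (opmode_present(op))
		return 0;
	/* The lock is not held here, so the module's init can take it. */
	if (fake_request_module(op, init_registers) != 0)
		return -1;
	return opmode_present(op) ? 0 : -1;
}
```

In this model the return value of the request alone cannot tell the two cases
apart; only the second opmode_present() check can.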

The only issue with a) is that today we have no *slim* way to let drivers
load a module asynchronously and then later verify it did get loaded.
From a *quick* look, this pattern of request_module_nowait() followed by
a verifier is either not widely used or not present at all today. A
verifier seems reasonable though if you use request_module_nowait()
and want to be pedantic about ensuring the module is there.

What might this look like for iwlwifi? Something like this:


So consider this for your driver -- if you agree today's print is rather
pointless upon failure then you'd be OK in just using request_module_nowait()
and removing the print -- and not adding a verifier step 2 like the above.

Only -- it seems you want a verifier.

So you have 2 options, the second with suboptions:

  1) keep sync request and add the verifier -- as in the original patch in this e-mail
  2) use async request and
	2a) add verifier
	2b) ignore the verifier

I don't see why you'd want 2b) is what I'm trying to say, and the point of this
patch is to show what a 1) would look like.
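
Option 2a) can be sketched in the same spirit: an asynchronous request that
only records that a load was asked for, and a verifier that runs later. This
is a userspace model under stated assumptions, not the driver's code. The
table layout and the load_requested flag mirror the patch in this e-mail, but
request_opmode_async(), opmode_register() and verify_requested_opmodes() are
hypothetical names, and locking and the delayed work are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { DVM_OP_MODE, MVM_OP_MODE, NUM_OP_MODES };

/* Userspace model of an opmode table entry; not the kernel structure. */
struct opmode_slot {
	const char *name;
	const void *ops;	/* set when the opmode registers */
	bool load_requested;	/* set when an async load was requested */
};

static struct opmode_slot opmode_table[NUM_OP_MODES] = {
	[DVM_OP_MODE] = { .name = "iwldvm" },
	[MVM_OP_MODE] = { .name = "iwlmvm" },
};

/* Step 1: fire-and-forget request. Nothing is known about the outcome. */
static void request_opmode_async(int idx)
{
	assert(idx >= 0 && idx < NUM_OP_MODES);
	if (!opmode_table[idx].ops)
		opmode_table[idx].load_requested = true;
}

/* What the opmode's init would do whenever it eventually runs. */
static void opmode_register(int idx, const void *ops)
{
	assert(idx >= 0 && idx < NUM_OP_MODES);
	opmode_table[idx].ops = ops;
}

/*
 * Step 2: deferred verifier. Clears satisfied requests and returns the
 * number of requested opmodes that are still missing.
 */
static int verify_requested_opmodes(void)
{
	int missing = 0;

	for (int i = 0; i < NUM_OP_MODES; i++) {
		struct opmode_slot *op = &opmode_table[i];

		if (!op->load_requested)
			continue;
		if (op->ops)
			op->load_requested = false;
		else
			missing++;
	}
	return missing;
}
```

Option 2b) is the same flow with step 2 simply dropped.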

The point of this email also is to highlight what it would look like in general
if we wanted verifiers for module request for async_schedule() calls, given
they cannot use request_module() and *must* use request_module_nowait() and
that ultimately begs the question if they want verifiers or not as well.

For iwlwifi and wireless this is only generally relevant for the async
callback for firmware requests, but it seems only iwlwifi uses this form.
I used async_schedule() for the driver data API which I'm developing,
and as such 2a or a solution for 1) was needed in such a way it was
compatible.

> That one is the one that continues the flow by calling:
> 
> 	ret = iwl_opmode_register("iwlmvm", &iwl_mvm_ops);
> 
> (for the iwlmvm case).
> 
> Where am I wrong here?

You are right, but note the 2 possible ways in which the alternative path
can be taken in the prior code we discussed; this should ensure we complain
if upon a load the module is really not present. From what I recall from
my testing though, in practice this *still* allows for the case where
iwlmvm loads before iwlwifi's async fw callback code checks for the
opmode. One can test this with a loop of:

modprobe -r iwlmvm; while true; do modprobe iwlmvm; modprobe -r iwlmvm; dmesg -c && echo ; done

Prior to this, add a check for an empty list on the opmode registration; if it's
empty then we've hit the path discussed. In the loop above we are *triggering*
the load of iwlmvm first, that's why it *can* load first. If we wanted to ensure
iwlmvm and the other opmodes load second, we can add a simple symbol dependency
from the opmodes to the iwlwifi driver.

  Luis

Comments

Emmanuel Grumbach Feb. 21, 2017, 7:16 a.m. UTC | #1
> 
> On Sun, Feb 19, 2017 at 09:47:59AM +0000, Grumbach, Emmanuel wrote:
> > >
> > > The return value of request_module() being 0 does not mean that the
> > > driver which was requested has loaded. To properly check that the
> > > driver was loaded each driver can use internal mechanisms to vet the
> > > driver is now present. The helper try_then_request_module() was
> > > added to help with this, allowing drivers to specify their own
> > > validation as the first argument.
> > >
> > > On iwlwifi the use case is a bit more complicated given that the
> > > value we need to check for is protected with a mutex later used on
> > > the module_init() of the module we are asking for. If we were to lock
> > > and request_module() we'd deadlock. iwlwifi needs its own wrapper then
> > > so it can handle its special locking requirements.
> > >
> > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> >
> > I don't see the problem with the current code. We don't assume that
> > everything has been loaded immediately after request_module returns.
> > We just free the intermediate firmware structures that won't be using
> > anymore. What happens here is that after request_module returns, we
> > patiently wait until it is loaded, and when that happens, iwl{d,m}vm's
> > init function will be called.
> 
> Right I get that.
> 
> The code today complains if its respective opmode module was not loaded if
> request_module() did not return 0. As the commit log explains, relying on a
> return code of 0 to ensure a module loads is not sufficient. So the current
> print is almost pointless, so best we either:
> 
> a) just remove the print and use instead request_module_nowait() (this is
>    more in alignment of what your code actually does today; or
> 
> b) fix the request_module() use so that the error print matches the
>    expected and proper recommended use of request_module() (what this
>    patch does)
> 
> I prefer a) actually but I had to show what b) looked like first :)
> 

Me too. Let's do the simple thing. After all, it's been working for 5 years now (maybe more?)
and I don't see a huge need to verify that the opmode module has been loaded.
It is very unlikely to fail anyway, and in the case it did fail, it's not that we can do much
from iwlwifi point of view. iwlwifi will stay loaded and sit idle since no opmode will
be there to start using the hardware. We will keep having the device claimed, and will
keep the interrupt registered and all that. No WiFi for you, but no harm caused either.
Luis Chamberlain Feb. 21, 2017, 6:15 p.m. UTC | #2
On Tue, Feb 21, 2017 at 07:16:16AM +0000, Grumbach, Emmanuel wrote:
> > 
> > a) just remove the print and use instead request_module_nowait() (this is
> > more in alignment of what your code actually does today; or
> > 
> > b) fix the request_module() use so that the error print matches the
> > expected and proper recommended use of request_module() (what this patch
> > does)
> > 
> > I prefer a) actually but I had to show what b) looked like first :)
>
> Me too. Let's do the simple thing. After all, it's been working for 5 years
> now (maybe more?) and I don't see a huge need to verify that the opmode
> module has been loaded.  It is very unlikely to fail anyway, and in the case
> it did fail, it's not that we can do much from iwlwifi point of view. 

I tend to agree with you on this, retries would be the only sensible thing to
do, but why do that -- the error should be logged properly and addressed by any
upper layers. It's one reason to consider adding verifiers in the future
as a built-in, optional part of module loading.

> iwlwifi will stay loaded and sit idle since no opmode will be there to start
> using the hardware. We will keep having the device claimed, and will keep the
> interrupt registered and all that. No WiFi for you, but no harm caused
> either.

Fine by me. Will send follow up simple patches.

  Luis
Luis Chamberlain Feb. 21, 2017, 8:17 p.m. UTC | #3
On Tue, Feb 21, 2017 at 07:15:41PM +0100, Luis R. Rodriguez wrote:
> On Tue, Feb 21, 2017 at 07:16:16AM +0000, Grumbach, Emmanuel wrote:
> > > 
> > > a) just remove the print and use instead request_module_nowait() (this is
> > > more in alignment of what your code actually does today; or
> > > 
> > > b) fix the request_module() use so that the error print matches the
> > > expected and proper recommended use of request_module() (what this patch
> > > does)
> > > 
> > > I prefer a) actually but I had to show what b) looked like first :)
> >
> > Me too. Let's do the simple thing. After all, it's been working for 5 years
> > now (maybe more?) and I don't see a huge need to verify that the opmode
> > module has been loaded.  It is very unlikely to fail anyway, and in the case
> > it did fail, it's not that we can do much from iwlwifi point of view. 
> 
> I tend to agree with you on this, retries would be the only sensible thing to
> do, but why do that -- the error should be logged right and addressed by any
> upper layers. Its one reason to consider in the future adding verifiers
> as built-in optional part of module loading.

It would seem we still need to offload the opmode start as it is the one that
really should be issuing the completion, otherwise we would end up sending a
completion while the opmode module is being loaded asynchronously. The changes
for that are still very likely desirable as they should help with speeding
up boot.

So the sharing of the opmode start will go first.

Will send v2.

  Luis
Luis Chamberlain Feb. 22, 2017, 12:18 a.m. UTC | #4
On Tue, Feb 21, 2017 at 09:17:15PM +0100, Luis R. Rodriguez wrote:
> On Tue, Feb 21, 2017 at 07:15:41PM +0100, Luis R. Rodriguez wrote:
> > On Tue, Feb 21, 2017 at 07:16:16AM +0000, Grumbach, Emmanuel wrote:
> > > > 
> > > > a) just remove the print and use instead request_module_nowait() (this is
> > > > more in alignment of what your code actually does today; or
> > > > 
> > > > b) fix the request_module() use so that the error print matches the
> > > > expected and proper recommended use of request_module() (what this patch
> > > > does)
> > > > 
> > > > I prefer a) actually but I had to show what b) looked like first :)
> > >
> > > Me too. Let's do the simple thing. After all, it's been working for 5 years
> > > now (maybe more?) and I don't see a huge need to verify that the opmode
> > > module has been loaded.  It is very unlikely to fail anyway, and in the case
> > > it did fail, it's not that we can do much from iwlwifi point of view. 
> > 
> > I tend to agree with you on this, retries would be the only sensible thing to
> > do, but why do that -- the error should be logged right and addressed by any
> > upper layers. Its one reason to consider in the future adding verifiers
> > as built-in optional part of module loading.
> 
> It would seem we still need to offload the opmode start as it is the one that
> really should be issuing the completion, otherwise we would end up sending a
> completion while the opmode module is being loaded asynchronously. The changes
> are for that are still very likely desirable as it should help with speeding
> boot up.
> 
> So the sharing of the opcode start will go first.
> 
> Will send v2.

Actually the completion was always being sent prior to request_module(), so this
would not change anything really. The sharing of the opmode start is then
optional, and I can send it separately in another series.

  Luis
Luis Chamberlain Feb. 22, 2017, 2:09 a.m. UTC | #5
This v2 addresses the preference to keep things simple on iwlwifi when
requesting modules, and so does not implement a verifier for loading the
opmode module. We now know what a verifier looks like for both sync and async
approaches. The already established, long-standing practice of just doing
a best-effort load suffices and keeps the driver cleaner.

There is no change to the first patch. The second patch just embraces
request_module_nowait() instead of implementing a verifier for a sync
call. The remaining patches from the last series will be sent separately.

Luis R. Rodriguez (2):
  iwlwifi: fix drv cleanup on opmode registration failure
  iwlwifi: simplify requesting ops module

 drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 28 ++++++++++------------------
 1 file changed, 10 insertions(+), 18 deletions(-)
Luis Chamberlain Feb. 22, 2017, 2:10 a.m. UTC | #6
This v2 splits off the opmode handling sharing code into its
own series; it depends however on the request_module_nowait()
change.

The sharing of the opmode handling makes it easier to share fixes
when dealing with opmode handling on devices. It should also hopefully
make the code easier to grok. Lastly, since we are moving things to
a workqueue, naturally the module_init() for iwlmvm is now offloaded,
and so this should reduce the boot time by a bit.

As per the average of systemd-analyze on 5 boots using next-20170221
as base:

next-20170221:
Startup finished in   2.6142s (kernel) +   5.1916s (initrd) +  10.8968s (userspace) =  18.7036s

After these patches:
Startup finished in   2.5468s (kernel) +   4.9536s (initrd) +   10.798s (userspace) =  18.2994s
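
To put the two averages side by side, the saving can be worked out from the
totals as reported above. This is a small sketch: the struct and function
names are made up for illustration, and the numbers are copied from the two
"Startup finished" lines without re-deriving the totals from the components.

```c
#include <assert.h>

/* Averaged systemd-analyze figures, in seconds, as reported above. */
struct boot_times {
	double kernel;
	double initrd;
	double userspace;
	double total;
};

static const struct boot_times base    = { 2.6142, 5.1916, 10.8968, 18.7036 };
static const struct boot_times patched = { 2.5468, 4.9536, 10.798,  18.2994 };

/* Absolute saving in seconds, using the reported totals. */
static double boot_saving_seconds(const struct boot_times *before,
				  const struct boot_times *after)
{
	return before->total - after->total;
}

/* Saving relative to the baseline total, in percent. */
static double boot_saving_percent(const struct boot_times *before,
				  const struct boot_times *after)
{
	assert(before->total > 0.0);
	return 100.0 * boot_saving_seconds(before, after) / before->total;
}
```

With these inputs the saving comes to roughly 0.40s, a little over 2% of the
baseline total.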

Luis R. Rodriguez (2):
  iwlwifi: share opmode start work code
  iwlwifi: convert final opmode work into a workqueue

 drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 93 +++++++++++++++++++---------
 1 file changed, 64 insertions(+), 29 deletions(-)

Patch

diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
index be466a074c1d..8059e1dab061 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
@@ -137,11 +137,16 @@  static struct iwlwifi_opmode_table {
 	const char *name;			/* name: iwldvm, iwlmvm, etc */
 	const struct iwl_op_mode_ops *ops;	/* pointer to op_mode ops */
 	struct list_head drv;		/* list of devices using this op_mode */
+	bool load_requested;		/* Do we need to load a driver ? */
+	struct iwl_drv *drv_req;	/* Device that set load_requested */
 } iwlwifi_opmode_table[] = {		/* ops set when driver is initialized */
 	[DVM_OP_MODE] = { .name = "iwldvm", .ops = NULL },
 	[MVM_OP_MODE] = { .name = "iwlmvm", .ops = NULL },
 };
 
+static void iwlwifi_load_opmode(struct work_struct *work);
+static DECLARE_DELAYED_WORK(iwl_opload_work, iwlwifi_load_opmode);
+
 #define IWL_DEFAULT_SCAN_CHANNELS 40
 
 /*
@@ -1231,6 +1236,43 @@  static void _iwl_op_mode_stop(struct iwl_drv *drv)
 	}
 }
 
+static void iwlwifi_load_opmode(struct work_struct *work)
+{
+	struct iwl_drv *drv = NULL;
+	struct iwlwifi_opmode_table *op;
+	unsigned int i;
+
+	mutex_lock(&iwlwifi_opmode_table_mtx);
+	for (i = 0; i < ARRAY_SIZE(iwlwifi_opmode_table); i++) {
+		op = &iwlwifi_opmode_table[i];
+		if (!op->load_requested)
+			continue;
+		drv = op->drv_req;
+
+		if (!op->ops && drv) {
+			IWL_ERR(drv,
+				"failed to load module %s, is dynamic loading enabled?\n",
+				op->name);
+			complete(&drv->request_firmware_complete);
+			device_release_driver(drv->trans->dev);
+			mutex_unlock(&iwlwifi_opmode_table_mtx);
+			return;
+		}
+
+		op->load_requested = false;
+		op->drv_req = NULL;
+	}
+	mutex_unlock(&iwlwifi_opmode_table_mtx);
+
+
+	/*
+	 * Complete the firmware request last so that
+	 * a driver unbind (stop) doesn't run while we
+	 * are doing the opmode start().
+	 */
+	complete(&drv->request_firmware_complete);
+}
+
 /**
  * iwl_req_fw_callback - callback when firmware was loaded
  *
@@ -1250,7 +1292,6 @@  static void iwl_req_fw_callback(const struct firmware *ucode_raw, void *context)
 	size_t trigger_tlv_sz[FW_DBG_TRIGGER_MAX];
 	u32 api_ver;
 	int i;
-	bool load_module = false;
 	bool usniffer_images = false;
 
 	fw->ucode_capa.max_probe_length = IWL_DEFAULT_MAX_PROBE_LENGTH;
@@ -1455,31 +1496,26 @@  static void iwl_req_fw_callback(const struct firmware *ucode_raw, void *context)
 			goto out_unbind;
 		}
 	} else {
-		load_module = true;
+		op->load_requested = true;
+		op->drv_req = drv;
 	}
 	mutex_unlock(&iwlwifi_opmode_table_mtx);
 
 	/*
-	 * Complete the firmware request last so that
-	 * a driver unbind (stop) doesn't run while we
-	 * are doing the start() above.
-	 */
-	complete(&drv->request_firmware_complete);
-
-	/*
 	 * Load the module last so we don't block anything
 	 * else from proceeding if the module fails to load
 	 * or hangs loading.
+	 *
+	 * Always try loading it, even if we were built-in as
+	 * in built-in cases this will be a no-op and so will
+	 * the verifier check.
 	 */
-	if (load_module) {
-		err = request_module("%s", op->name);
-#ifdef CONFIG_IWLWIFI_OPMODE_MODULAR
-		if (err)
-			IWL_ERR(drv,
-				"failed to load module %s (error %d), is dynamic loading enabled?\n",
-				op->name, err);
-#endif
-	}
+	err = request_module_nowait("%s", op->name);
+	if (err)
+		goto out_unbind;
+
+	schedule_delayed_work(&iwl_opload_work, HZ);
+
 	goto free;
 
  try_again: