[BUG,3.12.rc4] Oops: unable to handle kernel paging request during shutdown

Message ID 1483174.NsuFf8aJk6@vostro.rjw.lan (mailing list archive)
State Superseded, archived

Commit Message

Rafael J. Wysocki Oct. 25, 2013, 9:51 a.m. UTC
On Friday, October 25, 2013 11:28:02 AM Rafael J. Wysocki wrote:
> On Friday, October 25, 2013 10:02:22 AM Linus Torvalds wrote:
> > Adding more people, so quoting the whole email for them.
> > 
> > We definitely have some module unload issues. Guys, try the following
> > a few times to unload modules:
> > 
> >     lsmod | grep ' 0 '| cut -d' ' -f1 | xargs sudo rmmod
> > 
> > (a few times because unloading one module will then potentially make
> > other modules unloadable).
> > 
> > On my machine, I can trigger this, for example:
> > 
> >   ------------[ cut here ]------------
> >   WARNING: CPU: 0 PID: 3217 at fs/sysfs/file.c:498 sysfs_attr_ns+0x91/0xa0()
> >   sysfs: kobject (null) without dirent
> >   Modules linked in: fuse nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_$
> >   CPU: 0 PID: 3217 Comm: rmmod Not tainted 3.12.0-rc6-00284-ge6036c0b8896 #19
> >   Hardware name: Sony Corporation SVP11213CXB/VAIO, BIOS R0270V7 05/17/2013
> >    0000000000000009 ffff8800aca35df8 ffffffff8160aab5 ffff8800aca35e40
> >    ffff8800aca35e30 ffffffff810514b8 ffffffffa013f080 ffff8801194a6040
> >    0000000000000800 0000000000000000 0000000000c5b3e0 ffff8800aca35e90
> >   Call Trace:
> >    [<ffffffff8160aab5>] dump_stack+0x45/0x56
> >    [<ffffffff810514b8>] warn_slowpath_common+0x78/0xa0
> >    [<ffffffff81051527>] warn_slowpath_fmt+0x47/0x50
> >    [<ffffffff810b5960>] ? module_refcount+0xb0/0xb0
> >    [<ffffffff811e5c61>] sysfs_attr_ns+0x91/0xa0
> >    [<ffffffff811e5d2a>] sysfs_remove_file+0x1a/0x50
> >    [<ffffffff814c88a3>] cpufreq_sysfs_remove_file+0x13/0x30
> >    [<ffffffffa013d350>] acpi_cpufreq_exit+0x2e/0xcde [acpi_cpufreq]
> >    [<ffffffff810b7d1d>] SyS_delete_module+0x15d/0x2c0
> >    [<ffffffff81002929>] ? do_notify_resume+0x59/0x90
> >    [<ffffffff81618f62>] system_call_fastpath+0x16/0x1b
> >   ---[ end trace f887112caaa5c4ab ]---
> > 
> > so at least we have a cpufreq/sysfs interaction bug. There may be others.
> > 
> > This particular cpufreq issue may be triggered by the fact that
> > acpi-cpufreq isn't actually in use (pstate is). Or it might be some
> > generic cpufreq/sysfs bug. Rafael, Greg, ideas?
> 
> I *think* that this indeed is related to acpi-cpufreq being unused.  That said,
> we've been fixing sysfs-related bugs in cpufreq recently and we may have
> overlooked something.
> 
> I'll have a deeper look at that.

Well, if the ACPI cpufreq driver is not registered, the exit function of the
module shouldn't try to unregister it, so I have the appended patch (untested)
to fix that particular thing.

Rafael


---
 drivers/cpufreq/acpi-cpufreq.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Viresh Kumar Oct. 25, 2013, 9:54 a.m. UTC | #1
On 25 October 2013 15:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Friday, October 25, 2013 11:28:02 AM Rafael J. Wysocki wrote:
>> On Friday, October 25, 2013 10:02:22 AM Linus Torvalds wrote:

>> > This particular cpufreq issue may be triggered by the fact that
>> > acpi-cpufreq isn't actually in use (pstate is). Or it might be some
>> > generic cpufreq/sysfs bug. Rafael, Greg, ideas?
>>
>> I *think* that this indeed is related to acpi-cpufreq being unused.  That said,
>> we've been fixing sysfs-related bugs in cpufreq recently and we may have
>> overlooked something.

I agree. I recently tested a few other cpufreq drivers for module
insertion/removal, along with governor insertion/removal, so that part
must be okay.

> Well, if the ACPI cpufreq driver is not registered, the exit function of the
> module shouldn't try to unregister it, so I have the appended patch (untested)
> to fix that particular thing.
>
> Rafael
>
>
> ---
>  drivers/cpufreq/acpi-cpufreq.c |   10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> Index: linux-pm/drivers/cpufreq/acpi-cpufreq.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/acpi-cpufreq.c
> +++ linux-pm/drivers/cpufreq/acpi-cpufreq.c
> @@ -982,6 +982,8 @@ static void __exit acpi_cpufreq_boost_ex
>         }
>  }
>
> +static bool driver_registered;
> +
>  static int __init acpi_cpufreq_init(void)
>  {
>         int ret;
> @@ -1021,10 +1023,12 @@ static int __init acpi_cpufreq_init(void
>  #endif
>
>         ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> -       if (ret)
> +       if (ret) {
>                 free_acpi_perf_data();
> -       else
> +       } else {
>                 acpi_cpufreq_boost_init();
> +               driver_registered = true;
> +       }
>
>         return ret;
>  }
> @@ -1032,6 +1036,8 @@ static int __init acpi_cpufreq_init(void
>  static void __exit acpi_cpufreq_exit(void)
>  {
>         pr_debug("acpi_cpufreq_exit\n");
> +       if (!driver_registered)
> +               return;
>
>         acpi_cpufreq_boost_exit();

This looks like the right solution here. But this kind of issue looks
somewhat generic, doesn't it? Most of the drivers probably have the
same problem. They work because we normally have something like this
in the core unregister path:

int cpufreq_unregister_driver(struct cpufreq_driver *driver)
{
...
        if (!cpufreq_driver || (driver != cpufreq_driver))
                return -EINVAL;
....

So, even in this case, if we checked the return value of
cpufreq_unregister_driver() and only then did the rest of the cleanup,
we wouldn't need this extra variable.

The problem is the order in which things happen. Would it be a big
problem if we unregistered first and then called
acpi_cpufreq_boost_exit() only if the unregister succeeded?
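
To make the proposed ordering concrete, here is a minimal userspace
sketch, not kernel code. The register/unregister functions are
simplified stand-ins that only mirror the check quoted above;
other_driver, boost_exit_calls and the -EBUSY return are assumptions
made for illustration. Whether running acpi_cpufreq_boost_exit() after
the unregister is safe in the real driver is exactly the open question.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel structure; illustration only. */
struct cpufreq_driver {
	const char *name;
};

/* Currently registered driver, if any (mirrors the core's global). */
struct cpufreq_driver *cpufreq_driver;

/* Counts how often the boost teardown ran (assumption, for the demo). */
int boost_exit_calls;

struct cpufreq_driver acpi_cpufreq_driver = { "acpi-cpufreq" };
struct cpufreq_driver other_driver = { "other" };	/* e.g. pstate */

/* Simplified mock: only one driver may be registered at a time. */
int cpufreq_register_driver(struct cpufreq_driver *driver)
{
	if (cpufreq_driver)
		return -EBUSY;
	cpufreq_driver = driver;
	return 0;
}

/* Simplified mock: same guard as in the snippet quoted above. */
int cpufreq_unregister_driver(struct cpufreq_driver *driver)
{
	if (!cpufreq_driver || driver != cpufreq_driver)
		return -EINVAL;
	cpufreq_driver = NULL;
	return 0;
}

void acpi_cpufreq_boost_exit(void)
{
	boost_exit_calls++;
}

/* Proposed ordering: unregister first, clean up only on success. */
void acpi_cpufreq_exit(void)
{
	if (cpufreq_unregister_driver(&acpi_cpufreq_driver))
		return;	/* we were never the registered driver */

	acpi_cpufreq_boost_exit();
}
```

With another driver registered, acpi_cpufreq_exit() returns before any
teardown runs; with acpi_cpufreq_driver registered, the teardown runs
exactly once, without the extra driver_registered flag.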
Patch

Index: linux-pm/drivers/cpufreq/acpi-cpufreq.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/acpi-cpufreq.c
+++ linux-pm/drivers/cpufreq/acpi-cpufreq.c
@@ -982,6 +982,8 @@  static void __exit acpi_cpufreq_boost_ex
 	}
 }
 
+static bool driver_registered;
+
 static int __init acpi_cpufreq_init(void)
 {
 	int ret;
@@ -1021,10 +1023,12 @@  static int __init acpi_cpufreq_init(void
 #endif
 
 	ret = cpufreq_register_driver(&acpi_cpufreq_driver);
-	if (ret)
+	if (ret) {
 		free_acpi_perf_data();
-	else
+	} else {
 		acpi_cpufreq_boost_init();
+		driver_registered = true;
+	}
 
 	return ret;
 }
@@ -1032,6 +1036,8 @@  static int __init acpi_cpufreq_init(void
 static void __exit acpi_cpufreq_exit(void)
 {
 	pr_debug("acpi_cpufreq_exit\n");
+	if (!driver_registered)
+		return;
 
 	acpi_cpufreq_boost_exit();