Message ID | 20170531155043.3596-2-jani.nikula@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
From: Jani Nikula <jani.nikula@intel.com> Date: Wed, 31 May 2017 18:50:43 +0300 > From: Chris Wilson <chris@chris-wilson.co.uk> > > An error during suspend (e100e_pm_suspend), ... > lead to complete failure: ... > The unwind failures stems from commit 2800209994f8 ("e1000e: Refactor PM > flows"), but it may be a later patch that introduced the non-recoverable > behaviour. > > Fixes: 2800209994f8 ("e1000e: Refactor PM flows") > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > Cc: Dave Ertman <davidx.m.ertman@intel.com> > Cc: Bruce Allan <bruce.w.allan@intel.com> > Cc: intel-wired-lan@lists.osuosl.org > Cc: netdev@vger.kernel.org > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > [Jani: bikeshed repainted] > Signed-off-by: Jani Nikula <jani.nikula@intel.com> Jeff, please make sure this gets submitted to me soon. Thanks.
On 5/31/2017 18:50, Jani Nikula wrote: > From: Chris Wilson <chris@chris-wilson.co.uk> > > An error during suspend (e100e_pm_suspend), > > [ 429.994338] ACPI : EC: event blocked > [ 429.994633] e1000e: EEE TX LPI TIMER: 00000011 > [ 430.955451] pci_pm_suspend(): e1000e_pm_suspend+0x0/0x30 [e1000e] returns -2 > [ 430.955454] dpm_run_callback(): pci_pm_suspend+0x0/0x140 returns -2 > [ 430.955458] PM: Device 0000:00:19.0 failed to suspend async: error -2 > [ 430.955581] PM: Some devices failed to suspend, or early wake event detected > [ 430.957709] ACPI : EC: event unblocked > > lead to complete failure: > > [ 432.585002] ------------[ cut here ]------------ > [ 432.585013] WARNING: CPU: 3 PID: 8372 at kernel/irq/manage.c:1478 __free_irq+0x9f/0x280 > [ 432.585015] Trying to free already-free IRQ 20 > [ 432.585016] Modules linked in: cdc_ncm usbnet x86_pkg_temp_thermal intel_powerclamp coretemp mii crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep lpc_ich snd_hda_core snd_pcm mei_me mei sdhci_pci sdhci i915 mmc_core e1000e ptp pps_core prime_numbers > [ 432.585042] CPU: 3 PID: 8372 Comm: kworker/u16:40 Tainted: G U 4.10.0-rc8-CI-Patchwork_3870+ #1 > [ 432.585044] Hardware name: LENOVO 2356GCG/2356GCG, BIOS G7ET31WW (1.13 ) 07/02/2012 > [ 432.585050] Workqueue: events_unbound async_run_entry_fn > [ 432.585051] Call Trace: > [ 432.585058] dump_stack+0x67/0x92 > [ 432.585062] __warn+0xc6/0xe0 > [ 432.585065] warn_slowpath_fmt+0x4a/0x50 > [ 432.585070] ? _raw_spin_lock_irqsave+0x49/0x60 > [ 432.585072] __free_irq+0x9f/0x280 > [ 432.585075] free_irq+0x34/0x80 > [ 432.585089] e1000_free_irq+0x65/0x70 [e1000e] > [ 432.585098] e1000e_pm_freeze+0x7a/0xb0 [e1000e] > [ 432.585106] e1000e_pm_suspend+0x21/0x30 [e1000e] > [ 432.585113] pci_pm_suspend+0x71/0x140 > [ 432.585118] dpm_run_callback+0x6f/0x330 > [ 432.585122] ? pci_pm_freeze+0xe0/0xe0 > [ 432.585125] __device_suspend+0xea/0x330 > [ 432.585128] async_suspend+0x1a/0x90 > [ 432.585132] async_run_entry_fn+0x34/0x160 > [ 432.585137] process_one_work+0x1f4/0x6d0 > [ 432.585140] ? process_one_work+0x16e/0x6d0 > [ 432.585143] worker_thread+0x49/0x4a0 > [ 432.585145] kthread+0x107/0x140 > [ 432.585148] ? process_one_work+0x6d0/0x6d0 > [ 432.585150] ? kthread_create_on_node+0x40/0x40 > [ 432.585154] ret_from_fork+0x2e/0x40 > [ 432.585156] ---[ end trace 6712df7f8c4b9124 ]--- > > The unwind failures stems from commit 2800209994f8 ("e1000e: Refactor PM > flows"), but it may be a later patch that introduced the non-recoverable > behaviour. > > Fixes: 2800209994f8 ("e1000e: Refactor PM flows") > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > Cc: Dave Ertman <davidx.m.ertman@intel.com> > Cc: Bruce Allan <bruce.w.allan@intel.com> > Cc: intel-wired-lan@lists.osuosl.org > Cc: netdev@vger.kernel.org > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > [Jani: bikeshed repainted] > Signed-off-by: Jani Nikula <jani.nikula@intel.com> > --- > drivers/net/ethernet/intel/e1000e/netdev.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > index b3679728caac..5cad688be609 100644 > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > @@ -6630,12 +6630,17 @@ static int e1000e_pm_thaw(struct device *dev) > static int e1000e_pm_suspend(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > + int rc; > > e1000e_flush_lpic(pdev); > > e1000e_pm_freeze(dev); > > - return __e1000_shutdown(pdev, false); > + rc = __e1000_shutdown(pdev, false); > + if (rc) > + e1000e_pm_thaw(dev); > + > + return rc; > } > > static int e1000e_pm_resume(struct device *dev) Good. Let's pick up this patch.
On Fri, 2017-06-02 at 14:14 -0400, David Miller wrote: > From: Jani Nikula <jani.nikula@intel.com> > Date: Wed, 31 May 2017 18:50:43 +0300 > > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > An error during suspend (e100e_pm_suspend), > > ... > > lead to complete failure: > > ... > > The unwind failures stems from commit 2800209994f8 ("e1000e: > > Refactor PM > > flows"), but it may be a later patch that introduced the non- > > recoverable > > behaviour. > > > > Fixes: 2800209994f8 ("e1000e: Refactor PM flows") > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > > Cc: Dave Ertman <davidx.m.ertman@intel.com> > > Cc: Bruce Allan <bruce.w.allan@intel.com> > > Cc: intel-wired-lan@lists.osuosl.org > > Cc: netdev@vger.kernel.org > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > [Jani: bikeshed repainted] > > Signed-off-by: Jani Nikula <jani.nikula@intel.com> > > Jeff, please make sure this gets submitted to me soon. Expect it later tonight, just finishing up testing.
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On Behalf > Of Jeff Kirsher > Sent: Tuesday, June 6, 2017 1:46 PM > To: David Miller <davem@davemloft.net>; Nikula, Jani > <jani.nikula@intel.com> > Cc: Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; daniel.vetter@ffwll.ch; intel- > gfx@lists.freedesktop.org; linux-kernel@vger.kernel.org; > jani.nikula@linux.intel.com; chris@chris-wilson.co.uk; Ertman, David M > <david.m.ertman@intel.com>; intel-wired-lan@lists.osuosl.org; dri- > devel@lists.freedesktop.org; netdev@vger.kernel.org; airlied@gmail.com > Subject: Re: [Intel-wired-lan] [PATCH v2 1/1] e1000e: Undo > e1000e_pm_freeze if __e1000_shutdown fails > > On Fri, 2017-06-02 at 14:14 -0400, David Miller wrote: > > From: Jani Nikula <jani.nikula@intel.com> > > Date: Wed, 31 May 2017 18:50:43 +0300 > > > > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > > > An error during suspend (e100e_pm_suspend), > > > > ... > > > lead to complete failure: > > > > ... > > > The unwind failures stems from commit 2800209994f8 ("e1000e: > > > Refactor PM > > > flows"), but it may be a later patch that introduced the non- > > > recoverable > > > behaviour. > > > > > > Fixes: 2800209994f8 ("e1000e: Refactor PM flows") > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > > Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > > > Cc: Dave Ertman <davidx.m.ertman@intel.com> > > > Cc: Bruce Allan <bruce.w.allan@intel.com> > > > Cc: intel-wired-lan@lists.osuosl.org > > > Cc: netdev@vger.kernel.org > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > [Jani: bikeshed repainted] > > > Signed-off-by: Jani Nikula <jani.nikula@intel.com> > > > > Jeff, please make sure this gets submitted to me soon. > > Expect it later tonight, just finishing up testing. Tested-by: Aaron Brown <aaron.f.brown@intel.com>
On Wed, Jun 07, 2017 at 01:07:33AM +0000, Brown, Aaron F wrote: > > From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On Behalf > > Of Jeff Kirsher > > Sent: Tuesday, June 6, 2017 1:46 PM > > To: David Miller <davem@davemloft.net>; Nikula, Jani > > <jani.nikula@intel.com> > > Cc: Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; daniel.vetter@ffwll.ch; intel- > > gfx@lists.freedesktop.org; linux-kernel@vger.kernel.org; > > jani.nikula@linux.intel.com; chris@chris-wilson.co.uk; Ertman, David M > > <david.m.ertman@intel.com>; intel-wired-lan@lists.osuosl.org; dri- > > devel@lists.freedesktop.org; netdev@vger.kernel.org; airlied@gmail.com > > Subject: Re: [Intel-wired-lan] [PATCH v2 1/1] e1000e: Undo > > e1000e_pm_freeze if __e1000_shutdown fails > > > > On Fri, 2017-06-02 at 14:14 -0400, David Miller wrote: > > > From: Jani Nikula <jani.nikula@intel.com> > > > Date: Wed, 31 May 2017 18:50:43 +0300 > > > > > > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > > > > > An error during suspend (e100e_pm_suspend), > > > > > > ... > > > > lead to complete failure: > > > > > > ... > > > > The unwind failures stems from commit 2800209994f8 ("e1000e: > > > > Refactor PM > > > > flows"), but it may be a later patch that introduced the non- > > > > recoverable > > > > behaviour. > > > > > > > > Fixes: 2800209994f8 ("e1000e: Refactor PM flows") > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > > > Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > > > > Cc: Dave Ertman <davidx.m.ertman@intel.com> > > > > Cc: Bruce Allan <bruce.w.allan@intel.com> > > > > Cc: intel-wired-lan@lists.osuosl.org > > > > Cc: netdev@vger.kernel.org > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > > [Jani: bikeshed repainted] > > > > Signed-off-by: Jani Nikula <jani.nikula@intel.com> > > > > > > Jeff, please make sure this gets submitted to me soon. > > > > Expect it later tonight, just finishing up testing. > > Tested-by: Aaron Brown <aaron.f.brown@intel.com> Hm, I seem to be blind, but I can't find it anywhere in -rc6. Does someone have the sha1 from Linus' git for this patch? Thanks, Daniel
On 20 June 2017 at 18:49, Daniel Vetter <daniel@ffwll.ch> wrote: > On Wed, Jun 07, 2017 at 01:07:33AM +0000, Brown, Aaron F wrote: >> > From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On Behalf >> > Of Jeff Kirsher >> > Sent: Tuesday, June 6, 2017 1:46 PM >> > To: David Miller <davem@davemloft.net>; Nikula, Jani >> > <jani.nikula@intel.com> >> > Cc: Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; daniel.vetter@ffwll.ch; intel- >> > gfx@lists.freedesktop.org; linux-kernel@vger.kernel.org; >> > jani.nikula@linux.intel.com; chris@chris-wilson.co.uk; Ertman, David M >> > <david.m.ertman@intel.com>; intel-wired-lan@lists.osuosl.org; dri- >> > devel@lists.freedesktop.org; netdev@vger.kernel.org; airlied@gmail.com >> > Subject: Re: [Intel-wired-lan] [PATCH v2 1/1] e1000e: Undo >> > e1000e_pm_freeze if __e1000_shutdown fails >> > >> > On Fri, 2017-06-02 at 14:14 -0400, David Miller wrote: >> > > From: Jani Nikula <jani.nikula@intel.com> >> > > Date: Wed, 31 May 2017 18:50:43 +0300 >> > > >> > > > From: Chris Wilson <chris@chris-wilson.co.uk> >> > > > >> > > > An error during suspend (e100e_pm_suspend), >> > > >> > > ... >> > > > lead to complete failure: >> > > >> > > ... >> > > > The unwind failures stems from commit 2800209994f8 ("e1000e: >> > > > Refactor PM >> > > > flows"), but it may be a later patch that introduced the non- >> > > > recoverable >> > > > behaviour. >> > > > >> > > > Fixes: 2800209994f8 ("e1000e: Refactor PM flows") >> > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 >> > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >> > > > Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> >> > > > Cc: Dave Ertman <davidx.m.ertman@intel.com> >> > > > Cc: Bruce Allan <bruce.w.allan@intel.com> >> > > > Cc: intel-wired-lan@lists.osuosl.org >> > > > Cc: netdev@vger.kernel.org >> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> > > > [Jani: bikeshed repainted] >> > > > Signed-off-by: Jani Nikula <jani.nikula@intel.com> >> > > >> > > Jeff, please make sure this gets submitted to me soon. >> > >> > Expect it later tonight, just finishing up testing. >> >> Tested-by: Aaron Brown <aaron.f.brown@intel.com> > > Hm, I seem to be blind, but I can't find it anywhere in -rc6. Does someone > have the sha1 from Linus' git for this patch? Guys this is a pretty serious regression, just left blowing in the wind, is anyone responsible for e1000e? Dave.
On Wed, 2017-06-28 at 05:28 +1000, Dave Airlie wrote: > On 20 June 2017 at 18:49, Daniel Vetter <daniel@ffwll.ch> wrote: > > On Wed, Jun 07, 2017 at 01:07:33AM +0000, Brown, Aaron F wrote: > > > > From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] > > > > On Behalf > > > > Of Jeff Kirsher > > > > Sent: Tuesday, June 6, 2017 1:46 PM > > > > To: David Miller <davem@davemloft.net>; Nikula, Jani > > > > <jani.nikula@intel.com> > > > > Cc: Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; daniel.vetter@ffwll > > > > .ch; intel- > > > > gfx@lists.freedesktop.org; linux-kernel@vger.kernel.org; > > > > jani.nikula@linux.intel.com; chris@chris-wilson.co.uk; Ertman, > > > > David M > > > > <david.m.ertman@intel.com>; intel-wired-lan@lists.osuosl.org; dri- > > > > devel@lists.freedesktop.org; netdev@vger.kernel.org; airlied@gmail. > > > > com > > > > Subject: Re: [Intel-wired-lan] [PATCH v2 1/1] e1000e: Undo > > > > e1000e_pm_freeze if __e1000_shutdown fails > > > > > > > > On Fri, 2017-06-02 at 14:14 -0400, David Miller wrote: > > > > > From: Jani Nikula <jani.nikula@intel.com> > > > > > Date: Wed, 31 May 2017 18:50:43 +0300 > > > > > > > > > > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > > > > > > > > > An error during suspend (e100e_pm_suspend), > > > > > > > > > > ... > > > > > > lead to complete failure: > > > > > > > > > > ... > > > > > > The unwind failures stems from commit 2800209994f8 ("e1000e: > > > > > > Refactor PM > > > > > > flows"), but it may be a later patch that introduced the non- > > > > > > recoverable > > > > > > behaviour. > > > > > > > > > > > > Fixes: 2800209994f8 ("e1000e: Refactor PM flows") > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99847 > > > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > > > > > Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> > > > > > > Cc: Dave Ertman <davidx.m.ertman@intel.com> > > > > > > Cc: Bruce Allan <bruce.w.allan@intel.com> > > > > > > Cc: intel-wired-lan@lists.osuosl.org > > > > > > Cc: netdev@vger.kernel.org > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > > > > [Jani: bikeshed repainted] > > > > > > Signed-off-by: Jani Nikula <jani.nikula@intel.com> > > > > > > > > > > Jeff, please make sure this gets submitted to me soon. > > > > > > > > Expect it later tonight, just finishing up testing. > > > > > > Tested-by: Aaron Brown <aaron.f.brown@intel.com> > > > > Hm, I seem to be blind, but I can't find it anywhere in -rc6. Does > > someone > > have the sha1 from Linus' git for this patch? > > Guys this is a pretty serious regression, just left blowing in the > wind, is anyone responsible for e1000e? This was submitted and accepted into David Miller's net-next tree. I can see if Dave can pull it into his net tree. DOes stable need to pick this up as well?
On Tue, Jun 27, 2017 at 10:51 PM, Jeff Kirsher <jeffrey.t.kirsher@intel.com> wrote: > This was submitted and accepted into David Miller's net-next tree. I can > see if Dave can pull it into his net tree. DOes stable need to pick this > up as well? Nah if it landed somewhere at least I'm happy, we can carry the fixup for a while longer locally. Thanks, Daniel
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index b3679728caac..5cad688be609 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -6630,12 +6630,17 @@ static int e1000e_pm_thaw(struct device *dev) static int e1000e_pm_suspend(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); + int rc; e1000e_flush_lpic(pdev); e1000e_pm_freeze(dev); - return __e1000_shutdown(pdev, false); + rc = __e1000_shutdown(pdev, false); + if (rc) + e1000e_pm_thaw(dev); + + return rc; } static int e1000e_pm_resume(struct device *dev)