[ext4] e2ae766c1b: BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c

Message ID 20161212163518.GA20728@quack2.suse.cz (mailing list archive)
State New, archived

Commit Message

Jan Kara Dec. 12, 2016, 4:35 p.m. UTC
On Mon 12-12-16 18:13:21, kernel test robot wrote:
> FYI, we noticed the following commit:
> 
> commit: e2ae766c1b030271b5099b25674e2131d1d1e8c1 ("ext4: convert DAX faults to iomap infrastructure")
> https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
> 
> in testcase: nvml
> with following parameters:
> 
> 	group: vmem
> 	test: pmem
> 	nr_pmem: 1
> 	fs: ext4
> 	mount_option: dax
> 
> 
> 
> on test machine: 64 threads Intel(R) Xeon(R) CPU E5-4650 0 @ 2.70GHz with 64G memory
> 
> caused below changes:
> 
> 
> +------------------------------------------------+------------+------------+
> |                                                | 96f8ba3dd6 | e2ae766c1b |
> +------------------------------------------------+------------+------------+
> | boot_successes                                 | 2          | 2          |
> | boot_failures                                  | 2          | 2          |
> | BUG:kernel_hang_in_test_stage                  | 2          |            |
> | WARNING:at_fs/sysfs/dir.c:#sysfs_warn_dup      | 0          | 2          |
> | calltrace:parport_pc_init                      | 0          | 2          |
> | calltrace:SyS_finit_module                     | 0          | 2          |
> | WARNING:at_lib/kobject.c:#kobject_add_internal | 0          | 2          |
> +------------------------------------------------+------------+------------+
> 
> 
> 
> user  :notice: [  325.592182] vmem_aligned_alloc/TEST1: SETUP (check/pmem/debug)
> 
> user  :notice: [  325.603973] vmem_aligned_alloc/TEST1: START: vmem_aligned_alloc
> kern  :err   : [  325.608906] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:51
> kern  :err   : [  325.608908] in_atomic(): 1, irqs_disabled(): 0, pid: 24813, name: vmem_aligned_al
> kern  :warn  : [  325.608914] CPU: 44 PID: 24813 Comm: vmem_aligned_al Tainted: G           O    4.9.0-rc4-00045-ge2ae766 #1
> kern  :warn  : [  325.608916] Hardware name: Intel Corporation LH Pass/S4600LH...., BIOS SE5C600.86B.99.02.1047.032320122259 03/23/2012
> kern  :warn  : [  325.608922]  ffffc9002c1f7be0
> kern  :warn  : [  325.608923]  ffffffff81466af9
> kern  :warn  : [  325.608924]  ffff880fea2425c0

I think this is actually a bug introduced by Ross' PMD support. Attached
patch should fix it. Ross, can you check it please?

								Honza

Comments

Ross Zwisler Dec. 12, 2016, 10:13 p.m. UTC | #1
On Mon, Dec 12, 2016 at 05:35:18PM +0100, Jan Kara wrote:
> I think this is actually a bug introduced by Ross' PMD support. Attached
> patch should fix it. Ross, can you check it please?

Yep, that patch looks good.   You can add:

Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>

And I turned on CONFIG_DEBUG_ATOMIC_SLEEP in my test config.  :)

Theodore Ts'o Dec. 12, 2016, 10:37 p.m. UTC | #2
Is this problem likely to happen in other file systems?  Should I take
this patch through the ext4 tree, or would it be better via some other
git tree?

					- Ted

Ross Zwisler Dec. 12, 2016, 10:48 p.m. UTC | #3
On Mon, Dec 12, 2016 at 05:37:36PM -0500, Theodore Ts'o wrote:
> Is this problem likely to happen in other file systems?  Should I take
> this patch through the ext4 tree, or would it be better via some other
> git tree?
> 
> 					- Ted

The problem is in the generic DAX code and affects ext4 and xfs equally (ext2
doesn't support PMDs).

Theodore Ts'o Dec. 12, 2016, 11 p.m. UTC | #4
On Mon, Dec 12, 2016 at 03:48:51PM -0700, Ross Zwisler wrote:
> On Mon, Dec 12, 2016 at 05:37:36PM -0500, Theodore Ts'o wrote:
> > Is this problem likely to happen in other file systems?  Should I take
> > this patch through the ext4 tree, or would it be better via some other
> > git tree?
> > 
> > 					- Ted
> 
> The problem is in the generic DAX code and affects ext4 and xfs equally (ext2
> doesn't support PMDs).

Any preferences about how to send this patch to Linus?  This issue is
the only thing that was causing me to hold off on sending a pull
request to Linus....

						- Ted

Ross Zwisler Dec. 12, 2016, 11:13 p.m. UTC | #5
On Mon, Dec 12, 2016 at 06:00:20PM -0500, Theodore Ts'o wrote:
> On Mon, Dec 12, 2016 at 03:48:51PM -0700, Ross Zwisler wrote:
> > On Mon, Dec 12, 2016 at 05:37:36PM -0500, Theodore Ts'o wrote:
> > > Is this problem likely to happen in other file systems?  Should I take
> > > this patch through the ext4 tree, or would it be better via some other
> > > git tree?
> > > 
> > > 					- Ted
> > 
> > The problem is in the generic DAX code and affects ext4 and xfs equally (ext2
> > doesn't support PMDs).
> 
> Any preferences about how to send this patch to Linus?  This issue is
> the only thing that was causing me to hold off on sending a pull
> request to Linus....

Personally I'm happy to have you send it.  Thanks!

Huang, Ying Dec. 13, 2016, 1:27 a.m. UTC | #6
Jan Kara <jack@suse.cz> writes:

> I think this is actually a bug introduced by Ross' PMD support. Attached
> patch should fix it. Ross, can you check it please?

Hi, Jan

Could you provide a git tree commit for me to test it?  If you want it
to be tested by 0day.

Best Regards,
Huang, Ying

> 								Honza

Jan Kara Dec. 13, 2016, 8:47 a.m. UTC | #7
On Mon 12-12-16 18:00:20, Ted Tso wrote:
> On Mon, Dec 12, 2016 at 03:48:51PM -0700, Ross Zwisler wrote:
> > On Mon, Dec 12, 2016 at 05:37:36PM -0500, Theodore Ts'o wrote:
> > > Is this problem likely to happen in other file systems?  Should I take
> > > this patch through the ext4 tree, or would it be better via some other
> > > git tree?
> > > 
> > > 					- Ted
> > 
> > The problem is in the generic DAX code and affects ext4 and xfs equally (ext2
> > doesn't support PMDs).
> 
> Any preferences about how to send this patch to Linus?  This issue is
> the only thing that was causing me to hold off on sending a pull
> request to Linus....

Yeah, just take it unless Dave already did.

								Honza

Jan Kara Dec. 13, 2016, 11:42 a.m. UTC | #8
On Tue 13-12-16 09:27:51, Huang, Ying wrote:
> Jan Kara <jack@suse.cz> writes:
> 
> > I think this is actually a bug introduced by Ross' PMD support. Attached
> > patch should fix it. Ross, can you check it please?
> 
> Hi, Jan
> 
> Could you provide a git tree commit for me to test it?  If you want it
> to be tested by 0day.

Thanks for the offer! I've pushed out the latest version of my DAX patches
including the above fix to

git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git dax

								Honza

Huang, Ying Dec. 14, 2016, 3:11 a.m. UTC | #9
Jan Kara <jack@suse.cz> writes:

> On Tue 13-12-16 09:27:51, Huang, Ying wrote:
>> Jan Kara <jack@suse.cz> writes:
>> 
>> > I think this is actually a bug introduced by Ross' PMD support. Attached
>> > patch should fix it. Ross, can you check it please?
>> 
>> Hi, Jan
>> 
>> Could you provide a git tree commit for me to test it?  If you want it
>> to be tested by 0day.
>
> Thanks for the offer! I've pushed out the latest version of my DAX patches
> including the above fix to
>
> git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git dax

The test results for the head commit and its parent are as follows:

=========================================================================================
compiler/fs/group/kconfig/mount_option/nr_pmem/rootfs/tbox_group/test/testcase:
  gcc-6/ext4/vmem/x86_64-rhel-7.2/dax/1/debian-x86_64-2016-08-31.cgz/lkp-sbx04/pmem/nvml

commit: 
  c6207e9b753f466bc2e41455dc7611869d439d4e
  4393b9bdd5043e550f1bcaf7f4e9c413d0088425

c6207e9b753f466b 4393b9bdd5043e550f1bcaf7f4 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          3:3         -100%            :3     dmesg.BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c
          3:3         -100%            :3     kmsg.in_atomic():#,irqs_disabled():#,pid:#,name:vmem_aligned_al
          3:3         -100%            :3     kmsg.in_atomic():#,irqs_disabled():#,pid:#,name:vmem_multiple_p

The bug has been fixed.  Feel free to add,

Tested-by: "Huang, Ying" <ying.huang@intel.com>

Best Regards,
Huang, Ying

Patch

From c3d67dc7543abc03161f6cf357039ad9e56783ca Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 12 Dec 2016 16:32:23 +0100
Subject: [PATCH] dax: Fix sleep in atomic contex in grab_mapping_entry()

Commit 7b5b8c9c4ac9 "dax: add struct iomap based DAX PMD support"
introduced unmapping of page tables when a huge page needs to be split in
grab_mapping_entry(). However, the unmapping happens after the
radix_tree_preload() call, which disables preemption, and thus
unmap_mapping_range() tries to acquire i_mmap_lock in atomic context,
which is a bug. Fix the problem by moving the unmapping before the
radix_tree_preload() call.

Fixes: 7b5b8c9c4ac9716fe9d77ec56ae5d962192ba030
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/dax.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 51b03e91d3e2..5c74f60d0a50 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -351,14 +351,6 @@  static void *grab_mapping_entry(struct address_space *mapping, pgoff_t index,
 		}
 
 		spin_unlock_irq(&mapping->tree_lock);
-		err = radix_tree_preload(
-				mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM);
-		if (err) {
-			if (pmd_downgrade)
-				put_locked_mapping_entry(mapping, index, entry);
-			return ERR_PTR(err);
-		}
-
 		/*
 		 * Besides huge zero pages the only other thing that gets
 		 * downgraded are empty entries which don't need to be
@@ -368,6 +360,13 @@  static void *grab_mapping_entry(struct address_space *mapping, pgoff_t index,
 			unmap_mapping_range(mapping,
 				(index << PAGE_SHIFT) & PMD_MASK, PMD_SIZE, 0);
 
+		err = radix_tree_preload(
+				mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM);
+		if (err) {
+			if (pmd_downgrade)
+				put_locked_mapping_entry(mapping, index, entry);
+			return ERR_PTR(err);
+		}
 		spin_lock_irq(&mapping->tree_lock);
 
 		if (pmd_downgrade) {
-- 
2.10.2