diff mbox

Re: Oops using 2.6.28.n after a lazy umount of a crypted loop-device

Message ID 49AFBE4A.1010605@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Alasdair Kergon
Headers show

Commit Message

Milan Broz March 5, 2009, 11:58 a.m. UTC
Alexander Holler wrote:

> Hello,
>
> for some reason I can't remember I've done a lazy umount follwing the
> deregistration of the loop-device. The commands in question are:
>
> ---------
> umount -l /mnt/crypted
> cryptsetup luksClose crypted
> losetup -d /dev/loop1
> ---------
>
> Using Kernels 2.6.28.2 and .7 this two times resulted
> in an Oops like the following (both having the same Call Trace):
>
>   
Please Can you try attached patch if helps here?
(Patch is not perfect, but should help, at least identify that
it is the same problem I am fixing:-)

Milan
--
mbroz@redhat.com

dm crypt: Wait for possible unfinished endio() call in destructor

When user set dm-crypt over loop device and the loop thread processing
bios calls bio_endio later than the dm-crypt mapping is destroyed
(including mempool for dm io request), the endio can cause this OOPs:

(mempool_free from already destroyed mempool).

[   70.381058] EIP is at mempool_free+0x12/0x6b
...
[   70.381058] Process loop0 (pid: 4268, ti=cf3b2000 task=cf1cc1f0 task.ti=cf3b2000)
...
[   70.381058] Call Trace:
[   70.381058]  [<d0d76601>] ? crypt_dec_pending+0x5e/0x62 [dm_crypt]
[   70.381058]  [<d0d767b8>] ? crypt_endio+0xa2/0xaa [dm_crypt]
[   70.381058]  [<d0d76716>] ? crypt_endio+0x0/0xaa [dm_crypt]
[   70.381058]  [<c01a2f24>] ? bio_endio+0x2b/0x2e
[   70.381058]  [<d0806530>] ? dec_pending+0x224/0x23b [dm_mod]
[   70.381058]  [<d08066e4>] ? clone_endio+0x79/0xa4 [dm_mod]
[   70.381058]  [<d080666b>] ? clone_endio+0x0/0xa4 [dm_mod]
[   70.381058]  [<c01a2f24>] ? bio_endio+0x2b/0x2e
[   70.381058]  [<c02bad86>] ? loop_thread+0x380/0x3b7
[   70.381058]  [<c02ba8a1>] ? do_lo_send_aops+0x0/0x165
[   70.381058]  [<c013754f>] ? autoremove_wake_function+0x0/0x33
[   70.381058]  [<c02baa06>] ? loop_thread+0x0/0x3b7

Fix it by adding reference counter into crypt config and wait till
all endio operations finishes.

Signed-off-by: Milan Broz <mbroz@redhat.com>
---
 drivers/md/dm-crypt.c |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)



--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Comments

Milan Broz March 6, 2009, 8:24 a.m. UTC | #1
Alexander Holler wrote:
> Milan Broz schrieb:
>> Please Can you try attached patch if helps here?
>> (Patch is not perfect, but should help, at least identify that
>> it is the same problem I am fixing:-)
> 
> The patch works (I had to add an #include <linux/delay.h> /* msleep */).

ok, thanks.


> I've tested it using 2.6.28.7 and the script below. With your patch the 
> script was running over night looping about 400 times without any error. 
> A crosscheck without the patch needed only 10 iterations to get an oops. 
> So I assume you have fixed the problem I had. ;)
> 
> There stills seems to be another problem left, I've got 3 times the 
> kernel-message
> 
> device-mapper: ioctl: unable to remove open device 
> temporary-cryptsetup-21571

This is bug in cryptsetup related to udev (udev should not touch temporary
cryptsetup device) which is fixed upstream, but there is no
new release yet (Fedora has this patched, not sure about other distros)

see http://code.google.com/p/cryptsetup/source/detail?r=32 and other patches.

Milan
--
mbroz@redhat.com

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 35bda49..fa37c87 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -95,6 +95,8 @@  struct crypt_config {
 	struct workqueue_struct *io_queue;
 	struct workqueue_struct *crypt_queue;
 
+	atomic_t pending;
+
 	/*
 	 * crypto related data
 	 */
@@ -566,6 +568,7 @@  static void crypt_dec_pending(struct dm_crypt_io *io)
 	}
 
 	mempool_free(io, cc->io_pool);
+	atomic_dec(&cc->pending);
 }
 
 /*
@@ -1113,6 +1116,8 @@  static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		goto bad_crypt_queue;
 	}
 
+	atomic_set(&cc->pending, 0);
+
 	ti->private = cc;
 	return 0;
 
@@ -1149,6 +1154,9 @@  static void crypt_dtr(struct dm_target *ti)
 	destroy_workqueue(cc->io_queue);
 	destroy_workqueue(cc->crypt_queue);
 
+	while (atomic_read(&cc->pending))
+		msleep(1);
+
 	if (cc->req)
 		mempool_free(cc->req, cc->req_pool);
 
@@ -1171,8 +1179,11 @@  static void crypt_dtr(struct dm_target *ti)
 static int crypt_map(struct dm_target *ti, struct bio *bio,
 		     union map_info *map_context)
 {
+	struct crypt_config *cc = ti->private;
 	struct dm_crypt_io *io;
 
+	atomic_inc(&cc->pending);
+
 	io = crypt_io_alloc(ti, bio, bio->bi_sector - ti->begin);
 
 	if (bio_data_dir(io->base_bio) == READ)