
[0/8] dm: request-based dm-multipath

Message ID 49B613FE.3060501@suse.de (mailing list archive)
State Superseded, archived

Commit Message

Hannes Reinecke March 10, 2009, 7:17 a.m. UTC
Hi Kiyoshi,

Kiyoshi Ueda wrote:
> Hi Hannes,
> 
> On 2009/01/30 17:05 +0900, Kiyoshi Ueda wrote:
>>>>   o a kernel panic occurs with frequent table swapping during heavy I/O.
>>>>  
>>> That's probably fixed by this patch:
>>>
>>> --- linux-2.6.27/drivers/md/dm.c.orig   2009-01-23 15:59:22.741461315 +0100
>>> +++ linux-2.6.27/drivers/md/dm.c        2009-01-26 09:03:02.787605723 +0100
>>> @@ -714,13 +714,14 @@ static void free_bio_clone(struct reques
>>>         struct dm_rq_target_io *tio = clone->end_io_data;
>>>         struct mapped_device *md = tio->md;
>>>         struct bio *bio;
>>> -       struct dm_clone_bio_info *info;
>>>  
>>>         while ((bio = clone->bio) != NULL) {
>>>                 clone->bio = bio->bi_next;
>>>  
>>> -               info = bio->bi_private;
>>> -               free_bio_info(md, info);
>>> +               if (bio->bi_private) {
>>> +                       struct dm_clone_bio_info *info = bio->bi_private;
>>> +                       free_bio_info(md, info);
>>> +               }
>>>  
>>>                 bio->bi_private = md->bs;
>>>                 bio_put(bio);
>>>
>>> The info field is not necessarily filled here, so we have to check for it
>>> explicitly.
>>>
>>> With these two patches, request-based multipathing has survived all stress
>>> tests so far, except on the mainframe (zfcp), but that's more of a
>>> driver-related issue.
> 
> My problem was different from that one, and I have fixed my problem.
> 
What was it? Was it something specific to your setup, or something within the
request-based multipathing code?
If the latter, I'd be _very_ much interested in seeing the patch. Naturally.

> Do you hit some problem without the patch above?
> If so, that should be a programming bug and we need to fix it.  Otherwise,
> we would be leaking memory (since all cloned bios should always have
> the dm_clone_bio_info structure in ->bi_private).
> 
Yes, I've found that one later on.
The real problem was in clone_setup_bios(), which might end up dereferencing
an invalid end_io_data pointer. Patch is attached.

Cheers,

Hannes

Comments

Kiyoshi Ueda March 12, 2009, 8:58 a.m. UTC | #1
Hi Hannes,

On 2009/03/10 16:17 +0900, Hannes Reinecke wrote:
>>>>>   o a kernel panic occurs with frequent table swapping during heavy I/O.
>>>>>  
>>>> That's probably fixed by this patch:
>>>>
>>>> --- linux-2.6.27/drivers/md/dm.c.orig   2009-01-23 15:59:22.741461315 +0100
>>>> +++ linux-2.6.27/drivers/md/dm.c        2009-01-26 09:03:02.787605723 +0100
>>>> @@ -714,13 +714,14 @@ static void free_bio_clone(struct reques
>>>>         struct dm_rq_target_io *tio = clone->end_io_data;
>>>>         struct mapped_device *md = tio->md;
>>>>         struct bio *bio;
>>>> -       struct dm_clone_bio_info *info;
>>>>  
>>>>         while ((bio = clone->bio) != NULL) {
>>>>                 clone->bio = bio->bi_next;
>>>>  
>>>> -               info = bio->bi_private;
>>>> -               free_bio_info(md, info);
>>>> +               if (bio->bi_private) {
>>>> +                       struct dm_clone_bio_info *info = bio->bi_private;
>>>> +                       free_bio_info(md, info);
>>>> +               }
>>>>  
>>>>                 bio->bi_private = md->bs;
>>>>                 bio_put(bio);
>>>>
>>>> The info field is not necessarily filled here, so we have to check for
>>>> it explicitly.
>>>>
>>>> With these two patches, request-based multipathing has survived all
>>>> stress tests so far, except on the mainframe (zfcp), but that's more of
>>>> a driver-related issue.
>>
>> Do you hit some problem without the patch above?
>> If so, that should be a programming bug and we need to fix it.
>> Otherwise, we would be leaking memory (since all cloned bios should
>> always have the dm_clone_bio_info structure in ->bi_private).
>>
> Yes, I've found that one later on.
> The real problem was in clone_setup_bios(), which might end up
> dereferencing an invalid end_io_data pointer. Patch is attached.

Nice catch!  Thank you for the patch.

> -static void free_bio_clone(struct request *clone)
> +static void free_bio_clone(struct request *clone, struct mapped_device *md)

I have changed the argument order to match with other free_* functions:
    free_bio_clone(struct mapped_device *md, struct request *clone)

Thanks,
Kiyoshi Ueda

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Hannes Reinecke March 12, 2009, 9:08 a.m. UTC | #2
Hi Kiyoshi,

Kiyoshi Ueda wrote:
> Hi Hannes,
> 
> On 2009/03/10 16:17 +0900, Hannes Reinecke wrote:
[ .. ]
>> Yes, I've found that one later on.
>> The real problem was in clone_setup_bios(), which might end up
>> dereferencing an invalid end_io_data pointer. Patch is attached.
> 
> Nice catch!  Thank you for the patch.
> 
Oh, nae bother. Took me only a month to track it down :-(

>> -static void free_bio_clone(struct request *clone)
>> +static void free_bio_clone(struct request *clone, struct mapped_device *md)
> 
> I have changed the argument order to match with other free_* functions:
>     free_bio_clone(struct mapped_device *md, struct request *clone)
> 
Sure. I wasn't sure myself which way round the arguments should be.

Do you have an updated patch of your suspend fixes? We've run into an issue
here which looks suspiciously close to that one (I/O is completed on a deleted
pgpath), so we would be happy to test it out.

Cheers,

Hannes
Kiyoshi Ueda March 13, 2009, 1:03 a.m. UTC | #3
Hi Hannes,

On 2009/03/12 18:08 +0900, Hannes Reinecke wrote:
> Do you have an updated patch of your suspend fixes? We've run into an issue
> here which looks suspiciously close to that one (I/O is completed on a
> deleted pgpath), so we would be happy to test it out.

You mean that the issue occurs WITHOUT the suspend fix patch which
I sent, is that right?
If so, you can use it as-is, since I haven't made any big changes to
the suspend fix since then.
The only logic change to the suspend fix since then is in
rq_completed(), following your comment.  The updated rq_completed()
is below:

static void rq_completed(struct mapped_device *md)
{
	struct request_queue *q = md->queue;
	unsigned long flags;

	spin_lock_irqsave(q->queue_lock, flags);
	if (q->in_flight) {
		spin_unlock_irqrestore(q->queue_lock, flags);
		return;
	}
	spin_unlock_irqrestore(q->queue_lock, flags);

	/* nudge anyone waiting on suspend queue */
	wake_up(&md->wait);
}


I merged the previous suspend fix patch into the request-based dm
core patch, and I've been changing the core patch since then, so
I don't have a patch which addresses only the suspend fix update.
Sorry about that.

Thanks,
Kiyoshi Ueda


Patch

--- linux-2.6.27-SLE11_BRANCH/drivers/md/dm.c.orig	2009-02-04 10:33:22.656627650 +0100
+++ linux-2.6.27-SLE11_BRANCH/drivers/md/dm.c	2009-02-05 11:03:35.843251773 +0100
@@ -709,10 +709,8 @@  static void end_clone_bio(struct bio *cl
 	blk_update_request(tio->orig, 0, nr_bytes);
 }
 
-static void free_bio_clone(struct request *clone)
+static void free_bio_clone(struct request *clone, struct mapped_device *md)
 {
-	struct dm_rq_target_io *tio = clone->end_io_data;
-	struct mapped_device *md = tio->md;
 	struct bio *bio;
 
 	while ((bio = clone->bio) != NULL) {
@@ -743,7 +741,7 @@  static void dm_unprep_request(struct req
 	rq->special = NULL;
 	rq->cmd_flags &= ~REQ_DONTPREP;
 
-	free_bio_clone(clone);
+	free_bio_clone(clone, tio->md);
 	dec_rq_pending(tio);
 	free_rq_tio(tio->md, tio);
 }
@@ -820,7 +818,7 @@  static void dm_end_request(struct reques
 			rq->sense_len = clone->sense_len;
 	}
 
-	free_bio_clone(clone);
+	free_bio_clone(clone, tio->md);
 	dec_rq_pending(tio);
 	free_rq_tio(tio->md, tio);
 
@@ -1406,7 +1404,7 @@  static int clone_request_bios(struct req
 	return 0;
 
 free_and_out:
-	free_bio_clone(clone);
+	free_bio_clone(clone, md);
 
 	return -ENOMEM;
 }