
fio rbd completions (Was: fio rbd hang for block sizes > 1M)

Message ID 544BF808.2090800@kernel.dk (mailing list archive)
State New, archived

Commit Message

Jens Axboe Oct. 25, 2014, 7:20 p.m. UTC
On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
> On 25/10/14 16:47, Jens Axboe wrote:
>>
>> Since you're running rbd tests... Mind giving this patch a go? I don't
>> have an easy way to test it myself. It has nothing to do with this
>> issue, it's just a potentially faster way to do the rbd completions.
>>
> 
> Sure - but note I'm testing this on my i7 workstation (4x osd's running
> on 2x Crucial M550) so not exactly server grade :-)
> 
> With that in mind, I'm seeing slightly *slower* performance with the
> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
> cached.

Yeah, that doesn't look good. Mind trying this one out? I wonder if we
doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
working correctly. If you try this one, we should know more...

Goal is, I want to get rid of that usleep() in getevents.

Comments

Mark Kirkwood Oct. 25, 2014, 10:25 p.m. UTC | #1
On 26/10/14 08:20, Jens Axboe wrote:
> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>> On 25/10/14 16:47, Jens Axboe wrote:
>>>
>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>> have an easy way to test it myself. It has nothing to do with this
>>> issue, it's just a potentially faster way to do the rbd completions.
>>>
>>
>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>> on 2x Crucial M550) so not exactly server grade :-)
>>
>> With that in mind, I'm seeing slightly *slower* performance with the
>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>> cached.
>
> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
> working correctly. If you try this one, we should know more...
>
> Goal is, I want to get rid of that usleep() in getevents.
>

Testing with v3 patch applied hangs. I did wonder if we had somehow hit 
a new variant of the cache issue - so reran with it disabled in 
ceph.conf. Result is the same:

$ fio read-test.fio
rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K, 
ioengine=rbd, iodepth=32
fio-2.1.13-88-gb2ee7
Starting 1 process
rbd engine: RBD version: 0.1.8
Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta 
01h:25m:15s]
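
For reference, a job file that produces output like the above would look
roughly like this (the actual read-test.fio is not included in the thread,
so the pool, image and client names below are only placeholders):

[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
rw=read
bs=128k
iodepth=32

[rbd_thread]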



Ketor D Oct. 27, 2014, 10:25 a.m. UTC | #2
Hi Jens:
      After debugging the v3 patch, I found a bug in it.
      On the first fio_rbd_getevents loop, fri->io_seen is set to
1 and is never set back to 0, so the program gets stuck in an
endless loop in this code:

do {
	this_events = rbd_iter_events(td, &events, min, wait);

	if (events >= min)
		break;
	if (this_events)
		continue;

	wait = 1;
} while (1);

this_events and events are always 0, because fri->io_seen is always
1, so no events can ever be reaped.

The bug fix is:
in the functions _fio_rbd_finish_read_aiocb,
_fio_rbd_finish_write_aiocb and _fio_rbd_finish_sync_aiocb, add
"fio_rbd_iou->io_seen = 0;" after "fio_rbd_iou->io_complete = 1;".


The attachment is the new patch.

2014-10-27 17:27 GMT+08:00 Ketor D <d.ketor@gmail.com>:
> Hi, Jens:
>           I have tested your v2 and v3 patches.
>           The v2 patch gets SIGABRT and crashes. The v3 patch hangs.
>
>          Why not simply comment out the usleep?
>
>
> 2014-10-26 6:25 GMT+08:00 Mark Kirkwood <mark.kirkwood@catalyst.net.nz>:
>>
>> On 26/10/14 08:20, Jens Axboe wrote:
>>>
>>> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>>>>
>>>> On 25/10/14 16:47, Jens Axboe wrote:
>>>>>
>>>>>
>>>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>>>> have an easy way to test it myself. It has nothing to do with this
>>>>> issue, it's just a potentially faster way to do the rbd completions.
>>>>>
>>>>
>>>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>>>> on 2x Crucial M550) so not exactly server grade :-)
>>>>
>>>> With that in mind, I'm seeing slightly *slower* performance with the
>>>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>>>> cached.
>>>
>>>
>>> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
>>> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
>>> working correctly. If you try this one, we should know more...
>>>
>>> Goal is, I want to get rid of that usleep() in getevents.
>>>
>>
>> Testing with v3 patch applied hangs. I did wonder if we had somehow hit a
>> new variant of the cache issue - so reran with it disabled in ceph.conf.
>> Result is the same:
>>
>> $ fio read-test.fio
>> rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
>> ioengine=rbd, iodepth=32
>> fio-2.1.13-88-gb2ee7
>> Starting 1 process
>> rbd engine: RBD version: 0.1.8
>> Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>> 01h:25m:15s]
>>
>>
>>
>>
>
>
Jens Axboe Oct. 27, 2014, 2:15 p.m. UTC | #3
On 10/27/2014 03:27 AM, Ketor D wrote:
> Hi, Jens:
>           I have tested your v2 and v3 patches.
>           The v2 patch gets SIGABRT and crashes. The v3 patch hangs.
> 
>          Why not simply comment out the usleep?

Because that is very inefficient as well; fio would then basically be
busy-looping, waiting for the IO to finish.
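
Schematically (an illustration only, not the engine code; enough_events()
is just a stand-in for "have we reaped at least min completions", not a
real fio function):

	extern int enough_events(void);

	/* current behaviour: the poll is throttled by a short sleep */
	while (!enough_events())
		usleep(100);

	/* with the usleep() merely commented out, the same loop becomes a
	 * pure busy-wait that pins a CPU core until the completions arrive */
	while (!enough_events())
		;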
Jens Axboe Oct. 27, 2014, 2:19 p.m. UTC | #4
On 10/25/2014 04:25 PM, Mark Kirkwood wrote:
> On 26/10/14 08:20, Jens Axboe wrote:
>> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>>> On 25/10/14 16:47, Jens Axboe wrote:
>>>>
>>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>>> have an easy way to test it myself. It has nothing to do with this
>>>> issue, it's just a potentially faster way to do the rbd completions.
>>>>
>>>
>>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>>> on 2x Crucial M550) so not exactly server grade :-)
>>>
>>> With that in mind, I'm seeing slightly *slower* performance with the
>>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>>> cached.
>>
>> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
>> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
>> working correctly. If you try this one, we should know more...
>>
>> Goal is, I want to get rid of that usleep() in getevents.
>>
> 
> Testing with v3 patch applied hangs. I did wonder if we had somehow hit
> a new variant of the cache issue - so reran with it disabled in
> ceph.conf. Result is the same:
> 
> $ fio read-test.fio
> rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
> ioengine=rbd, iodepth=32
> fio-2.1.13-88-gb2ee7
> Starting 1 process
> rbd engine: RBD version: 0.1.8
> Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
> 01h:25m:15s]

There were, unfortunately, still two bugs left in -v3. I just posted an
updated one; please try that and see if it works for you.
Ketor D Oct. 27, 2014, 3:12 p.m. UTC | #5
The v5 patch does not work.

Ran it 5 times:
3 times: SIGSEGV
2 times: no IOPS, endless loop



2014-10-27 22:19 GMT+08:00 Jens Axboe <axboe@kernel.dk>:
> On 10/25/2014 04:25 PM, Mark Kirkwood wrote:
>> On 26/10/14 08:20, Jens Axboe wrote:
>>> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>>>> On 25/10/14 16:47, Jens Axboe wrote:
>>>>>
>>>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>>>> have an easy way to test it myself. It has nothing to do with this
>>>>> issue, it's just a potentially faster way to do the rbd completions.
>>>>>
>>>>
>>>> Sure - but note I'm testing this on my i7 workstation (4x osd's running
>>>> on 2x Crucial M550) so not exactly server grade :-)
>>>>
>>>> With that in mind, I'm seeing slightly *slower* performance with the
>>>> patch applied: e.g: for 128k blocks - 2 runs, 1 uncached and the next
>>>> cached.
>>>
>>> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
>>> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
>>> working correctly. If you try this one, we should know more...
>>>
>>> Goal is, I want to get rid of that usleep() in getevents.
>>>
>>
>> Testing with v3 patch applied hangs. I did wonder if we had somehow hit
>> a new variant of the cache issue - so reran with it disabled in
>> ceph.conf. Result is the same:
>>
>> $ fio read-test.fio
>> rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
>> ioengine=rbd, iodepth=32
>> fio-2.1.13-88-gb2ee7
>> Starting 1 process
>> rbd engine: RBD version: 0.1.8
>> Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>> 01h:25m:15s]
>
> There were, unfortunately, still two bugs left in -v3. I just posted an
> updated one, please try that and see if it works for you.
>
> --
> Jens Axboe
>

Patch

diff --git a/engines/rbd.c b/engines/rbd.c
index 6fe87b8d010c..2353b1f11caf 100644
--- a/engines/rbd.c
+++ b/engines/rbd.c
@@ -11,7 +11,9 @@ 
 
 struct fio_rbd_iou {
 	struct io_u *io_u;
+	rbd_completion_t completion;
 	int io_complete;
+	int io_seen;
 };
 
 struct rbd_data {
@@ -221,34 +223,69 @@  static struct io_u *fio_rbd_event(struct thread_data *td, int event)
 	return rbd_data->aio_events[event];
 }
 
-static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
-			     unsigned int max, const struct timespec *t)
+static inline int fri_check_complete(struct rbd_data *rbd_data,
+				     struct io_u *io_u,
+				     unsigned int *events)
+{
+	struct fio_rbd_iou *fri = io_u->engine_data;
+
+	if (fri->io_complete) {
+		fri->io_complete = 0;
+		fri->io_seen = 1;
+		rbd_data->aio_events[*events] = io_u;
+		(*events)++;
+		return 1;
+	}
+
+	return 0;
+}
+
+static int rbd_iter_events(struct thread_data *td, unsigned int *events,
+			   unsigned int min_evts, int wait)
 {
 	struct rbd_data *rbd_data = td->io_ops->data;
-	unsigned int events = 0;
+	unsigned int this_events = 0;
 	struct io_u *io_u;
 	int i;
-	struct fio_rbd_iou *fov;
 
-	do {
-		io_u_qiter(&td->io_u_all, io_u, i) {
-			if (!(io_u->flags & IO_U_F_FLIGHT))
-				continue;
+	io_u_qiter(&td->io_u_all, io_u, i) {
+		struct fio_rbd_iou *fri = io_u->engine_data;
 
-			fov = (struct fio_rbd_iou *)io_u->engine_data;
+		if (!(io_u->flags & IO_U_F_FLIGHT))
+			continue;
+		if (fri->io_seen)
+			continue;
 
-			if (fov->io_complete) {
-				fov->io_complete = 0;
-				rbd_data->aio_events[events] = io_u;
-				events++;
-			}
+		if (fri_check_complete(rbd_data, io_u, events))
+			this_events++;
+		else if (wait) {
+			rbd_aio_wait_for_complete(fri->completion);
 
+			if (fri_check_complete(rbd_data, io_u, events))
+				this_events++;
 		}
-		if (events < min)
-			usleep(100);
-		else
+		if (*events >= min_evts)
+			break;
+	}
+
+	return this_events;
+}
+
+static int fio_rbd_getevents(struct thread_data *td, unsigned int min,
+			     unsigned int max, const struct timespec *t)
+{
+	unsigned int this_events, events = 0;
+	int wait = 0;
+
+	do {
+		this_events = rbd_iter_events(td, &events, min, wait);
+
+		if (events >= min)
 			break;
+		if (this_events)
+			continue;
 
+		wait = 1;
 	} while (1);
 
 	return events;
@@ -258,7 +295,7 @@  static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
 {
 	int r = -1;
 	struct rbd_data *rbd_data = td->io_ops->data;
-	rbd_completion_t comp;
+	struct fio_rbd_iou *fri = io_u->engine_data;
 
 	fio_ro_check(td, io_u);
 
@@ -266,7 +303,7 @@  static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
 		r = rbd_aio_create_completion(io_u,
 					      (rbd_callback_t)
 					      _fio_rbd_finish_write_aiocb,
-					      &comp);
+					      &fri->completion);
 		if (r < 0) {
 			log_err
 			    ("rbd_aio_create_completion for DDIR_WRITE failed.\n");
@@ -274,7 +311,8 @@  static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
 		}
 
 		r = rbd_aio_write(rbd_data->image, io_u->offset,
-				  io_u->xfer_buflen, io_u->xfer_buf, comp);
+				  io_u->xfer_buflen, io_u->xfer_buf,
+				  fri->completion);
 		if (r < 0) {
 			log_err("rbd_aio_write failed.\n");
 			goto failed;
@@ -284,7 +322,7 @@  static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
 		r = rbd_aio_create_completion(io_u,
 					      (rbd_callback_t)
 					      _fio_rbd_finish_read_aiocb,
-					      &comp);
+					      &fri->completion);
 		if (r < 0) {
 			log_err
 			    ("rbd_aio_create_completion for DDIR_READ failed.\n");
@@ -292,7 +330,8 @@  static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
 		}
 
 		r = rbd_aio_read(rbd_data->image, io_u->offset,
-				 io_u->xfer_buflen, io_u->xfer_buf, comp);
+				 io_u->xfer_buflen, io_u->xfer_buf,
+				 fri->completion);
 
 		if (r < 0) {
 			log_err("rbd_aio_read failed.\n");
@@ -303,14 +342,14 @@  static int fio_rbd_queue(struct thread_data *td, struct io_u *io_u)
 		r = rbd_aio_create_completion(io_u,
 					      (rbd_callback_t)
 					      _fio_rbd_finish_sync_aiocb,
-					      &comp);
+					      &fri->completion);
 		if (r < 0) {
 			log_err
 			    ("rbd_aio_create_completion for DDIR_SYNC failed.\n");
 			goto failed;
 		}
 
-		r = rbd_aio_flush(rbd_data->image, comp);
+		r = rbd_aio_flush(rbd_data->image, fri->completion);
 		if (r < 0) {
 			log_err("rbd_flush failed.\n");
 			goto failed;
@@ -439,22 +478,21 @@  static int fio_rbd_invalidate(struct thread_data *td, struct fio_file *f)
 
 static void fio_rbd_io_u_free(struct thread_data *td, struct io_u *io_u)
 {
-	struct fio_rbd_iou *o = io_u->engine_data;
+	struct fio_rbd_iou *fri = io_u->engine_data;
 
-	if (o) {
+	if (fri) {
 		io_u->engine_data = NULL;
-		free(o);
+		free(fri);
 	}
 }
 
 static int fio_rbd_io_u_init(struct thread_data *td, struct io_u *io_u)
 {
-	struct fio_rbd_iou *o;
+	struct fio_rbd_iou *fri;
 
-	o = malloc(sizeof(*o));
-	o->io_complete = 0;
-	o->io_u = io_u;
-	io_u->engine_data = o;
+	fri = calloc(1, sizeof(*fri));
+	fri->io_u = io_u;
+	io_u->engine_data = fri;
 	return 0;
 }