write and read error handling problems

Message ID	CABAwU-ao6=DfMEeJZLZnYniVnzs+Gp7BYLoGv-36oUUPCR77Qg@mail.gmail.com (mailing list archive)
State	New, archived
Headers	Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by demeter2.kernel.org (8.14.4/8.14.4) with ESMTP id p7ACBApc006970 for <patchwork-ceph-devel@patchwork.kernel.org>; Wed, 10 Aug 2011 12:11:10 GMT
MIME-Version: 1.0
Date: Wed, 10 Aug 2011 20:11:07 +0800
Message-ID: <CABAwU-ao6=DfMEeJZLZnYniVnzs+Gp7BYLoGv-36oUUPCR77Qg@mail.gmail.com>
Subject: write and read error handling problems
From: huang jun <hjwsm1989@gmail.com>
To: ceph-devel <ceph-devel@vger.kernel.org>
Content-Type: text/plain; charset=ISO-8859-1
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk

Commit Message

huang jun Aug. 10, 2011, 12:11 p.m. UTC

Hi all,
About OSD read ops: if the OSD hits an error, it just returns,
which may leak memory. We patched it.
Thanks!
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Sage Weil Aug. 10, 2011, 2:36 p.m. UTC | #1

On Wed, 10 Aug 2011, huang jun wrote:

> Hi all,
> About OSD read ops: if the OSD hits an error, it just returns,
> which may leak memory. We patched it.
> diff --git a/src/osd/ReplicatedPG.cc b/src/osd/ReplicatedPG.cc
> index 2ab21bb..21fbca7 100644
> --- a/src/osd/ReplicatedPG.cc
> +++ b/src/osd/ReplicatedPG.cc
> @@ -588,8 +588,18 @@ void ReplicatedPG::do_op(MOSDOp *op)
>      obc->ondisk_read_unlock();
>    }
> 
> -  if (result == -EAGAIN)
> +  if (result == -EAGAIN) {
> +    delete ctx;
>      return;
> +  }
> Please take a look!

That was a leak, yep!  In the current master it's already fixed up (along 
with the obc and src_obc [something new]):

  if (result == -EAGAIN) {
    // clean up after the ctx
    delete ctx;
    put_object_context(obc);
    put_object_contexts(src_obc);
    return;
  }
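
As a side note, one way to make this kind of early-return leak harder to reintroduce is to tie the context's lifetime to a scope. The following is only a minimal sketch under stated assumptions: `OpContext` and `do_op_sketch` are hypothetical stand-ins, not the real `ReplicatedPG` types, and the real code may hand ownership of `ctx` to later stages, in which case the owner would have to release it explicitly rather than let it be destroyed.

```cpp
#include <cassert>
#include <cerrno>
#include <memory>

// Hypothetical stand-in for the per-op context; NOT the real Ceph type.
struct OpContext {
  static int live;            // counts live instances so a leak is observable
  OpContext() { ++live; }
  ~OpContext() { --live; }
};
int OpContext::live = 0;

// Hypothetical stand-in for do_op(): returns the result code it was given.
// The context is owned by a unique_ptr, so every return path frees it.
int do_op_sketch(int result) {
  auto ctx = std::make_unique<OpContext>();
  if (result == -EAGAIN) {
    return result;            // early return: ctx is destroyed here
  }
  // ... normal processing would use *ctx here ...
  return result;              // normal return: ctx is destroyed here too
}

// Usage: neither path leaves a context behind.
void demo_no_leak() {
  do_op_sketch(-EAGAIN);
  assert(OpContext::live == 0);
  do_op_sketch(0);
  assert(OpContext::live == 0);
}
```

The point of the sketch is only that no explicit `delete` is needed on any return path; it does not cover the reference-counted `obc` and `src_obc` objects, which are released through their own `put_*` calls in the snippet above.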

> So I'm confused about the error handling strategy for write/read
> operations in the OSD.
> If Ceph just returns when it encounters an error, is the work passed to the client?
> Let's take writing a file as an example. The client sends a request to write a 4MB file,
> and the OSD first writes the OSD journal, then returns a commit message to the client.
> But if the file write is then interrupted by a broken disk sector or
> another error, the write op has failed. What is the OSD going to
> do? Replay it from the previously written journal entry? Or something else?

Currently if cosd gets an error back from the underlying file system it 
will make itself crash, effectively escalating a sector error into a 
failure of cosd itself.  If you (the admin) are able to repair the disk/fs 
and restart cosd, it will replay from the journal and continue.  
Otherwise you can replace the disk and it will recover the whole osd's 
data set from the rest of the cluster.

sage
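
The behaviour described in the reply above (journal first, fail-stop on a file system error, replay after restart) can be illustrated with a toy model. This is a rough sketch, not Ceph code: `ToyOsd`, `JournalEntry`, `submit`, `apply`, `replay`, and the `fs_ok` flag are all hypothetical names, the "crash" is modelled as a failed return instead of a process abort, and durability across the restart is simulated by keeping the same in-memory object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy types; none of these are Ceph classes.
struct JournalEntry {
  std::size_t seq;
  std::string key;
  std::string value;
};

struct ToyOsd {
  std::vector<JournalEntry> journal;         // stand-in for the durable journal
  std::map<std::string, std::string> store;  // stand-in for the backing file system
  std::size_t applied_seq = 0;               // last entry known to be in the store

  // Step 1: journal the write. After this the client can be sent a commit.
  std::size_t submit(const std::string& key, const std::string& value) {
    std::size_t seq = journal.size() + 1;
    journal.push_back({seq, key, value});
    return seq;
  }

  // Step 2: apply the entry to the store. fs_ok simulates the file system
  // result; on failure a fail-stop daemon would abort instead of returning.
  bool apply(const JournalEntry& e, bool fs_ok) {
    if (!fs_ok) return false;
    store[e.key] = e.value;
    applied_seq = e.seq;
    return true;
  }

  // After repair and restart: re-apply every journaled entry not yet applied.
  void replay() {
    for (const auto& e : journal) {
      if (e.seq > applied_seq) apply(e, true);
    }
  }
};

// Usage: a write is journaled, the apply fails, and replay recovers it.
void demo_replay() {
  ToyOsd osd;
  osd.submit("obj", "v1");
  bool ok = osd.apply(osd.journal.back(), /*fs_ok=*/false);  // simulated bad sector
  assert(!ok && osd.store.empty());
  osd.replay();  // the admin has repaired the disk and restarted the daemon
  assert(osd.store.at("obj") == "v1");
}
```

The design choice being illustrated is that an acknowledged write is never silently dropped: it either reaches the store or stays in the journal until a replay succeeds. The other recovery path mentioned above, rebuilding a replaced disk from the rest of the cluster, is outside what this toy models.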

Patch

diff --git a/src/osd/ReplicatedPG.cc b/src/osd/ReplicatedPG.cc
index 2ab21bb..21fbca7 100644
--- a/src/osd/ReplicatedPG.cc
+++ b/src/osd/ReplicatedPG.cc
@@ -588,8 +588,18 @@  void ReplicatedPG::do_op(MOSDOp *op)
     obc->ondisk_read_unlock();
   }

-  if (result == -EAGAIN)
+  if (result == -EAGAIN) {
+    delete ctx;
     return;
+  }
Please take a look!

So I'm confused about the error handling strategy for write/read
operations in the OSD.
If Ceph just returns when it encounters an error, is the work passed to the client?
Let's take writing a file as an example. The client sends a request to write a 4MB file,
and the OSD first writes the OSD journal, then returns a commit message to the client.
But if the file write is then interrupted by a broken disk sector or
another error, the write op has failed. What is the OSD going to
do? Replay it from the previously written journal entry? Or something else?
