From patchwork Tue Oct 21 02:49:52 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alexandre Oliva
X-Patchwork-Id: 5108901
From: Alexandre Oliva
To: Sage Weil
Cc: sam.just@inktank.com, ceph-devel@vger.kernel.org
Subject: Re: [PATCH] reinstate ceph cluster_snap support
Organization: Free thinker, not speaking for the GNU Project
Date: Tue, 21 Oct 2014 00:49:52 -0200
In-Reply-To: (Sage Weil's message of "Tue, 27 Aug 2013 15:21:52 -0700 (PDT)")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

On Aug 27, 2013, Sage Weil wrote:

> Finally, eventually we should make this do a checkpoint on the mons too.
> We can add the osd snapping back in first, but before this can/should
> really be used the mons need to be snapshotted as well.  Probably that's
> just adding in a snapshot() method to MonitorStore.h and doing either a
> leveldb snap or making a full copy of store.db... I forget what leveldb is
> capable of here.

I suppose it might be a bit too late for Giant, but I finally got 'round
to implementing this.  I attach the patch that implements it, to be
applied on top of the updated version of the patch I posted before, also
attached.

I have a backport to Firefly too, if there's interest.

I have tested both methods: btrfs snapshotting of store.db (I've
manually turned store.db into a btrfs subvolume), and creating a new db
with all (prefix,key,value) triples.  I'm undecided about inserting
multiple transaction commits for the latter case; the mon mem use grew
up a lot as it was, and in a few tests the snapshotting ran twice, but
in the end a dump of all the data in the database created by btrfs
snapshotting was identical to that created by explicit copying.  So, the
former is preferred, since it's so incredibly more efficient.  I also
considered hardlinking all files in store.db into a separate tree, but I
didn't like the idea of coding that in C+-, :-) and I figured it might
not work with other db backends, and maybe even not be guaranteed to
work with leveldb.  It's probably not worth much more effort.

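The explicit-copy method mentioned above, with the "multiple transaction
commits" variant, can be sketched roughly as follows.  This is only an
illustration under stated assumptions: ToyStore, commit(), copy_store() and
max_batch are hypothetical names, and a std::map stands in for the real
key/value backend.  It is not the code in the attached patch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a key/value backend; not Ceph's monitor store.
using Key = std::pair<std::string, std::string>;  // (prefix, key)
using Triple = std::pair<Key, std::string>;       // ((prefix, key), value)

struct ToyStore {
  std::map<Key, std::string> data;
  std::size_t commits = 0;

  // Apply one batch of writes as a single "transaction".
  void commit(const std::vector<Triple>& batch) {
    for (const auto& kv : batch)
      data[kv.first] = kv.second;
    ++commits;
  }
};

// Copy every (prefix, key, value) triple from src into dst, committing after
// at most max_batch triples so the pending transaction stays bounded in size.
inline void copy_store(const ToyStore& src, ToyStore& dst,
                       std::size_t max_batch) {
  std::vector<Triple> batch;
  for (const auto& kv : src.data) {
    batch.emplace_back(kv.first, kv.second);
    if (batch.size() >= max_batch) {
      dst.commit(batch);
      batch.clear();
    }
  }
  if (!batch.empty())
    dst.commit(batch);  // flush the final partial batch
}
```

Batching keeps the pending transaction small, at the cost that an
interrupted copy leaves a partially written destination, which is one reason
to hesitate about splitting the copy into several commits.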
reinstate ceph cluster_snap support

From: Alexandre Oliva

This patch brings back and updates (for dumpling) the code originally
introduced to support “ceph osd cluster_snap ”, that was disabled and
partially removed before cuttlefish.

Some minimal testing appears to indicate this even works: the modified
mon actually generated an osdmap with the cluster_snap request, and
starting a modified osd that was down and letting it catch up caused
the osd to take the requested snapshot.  I see no reason why it
wouldn't have taken it if it was up and running, so...  Why was this
feature disabled in the first place?

Signed-off-by: Alexandre Oliva
---
 src/mon/MonCommands.h |    6 ++++--
 src/mon/OSDMonitor.cc |   11 +++++++----
 src/osd/OSD.cc        |    8 ++++++++
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/src/mon/MonCommands.h b/src/mon/MonCommands.h
index d702615..8f468f4 100644
--- a/src/mon/MonCommands.h
+++ b/src/mon/MonCommands.h
@@ -499,8 +499,10 @@ COMMAND("osd set " \
 COMMAND("osd unset " \
 	"name=key,type=CephChoices,strings=pause|noup|nodown|noout|noin|nobackfill|norecover|noscrub|nodeep-scrub|notieragent", \
 	"unset ", "osd", "rw", "cli,rest")
-COMMAND("osd cluster_snap", "take cluster snapshot (disabled)", \
-	"osd", "r", "")
+COMMAND("osd cluster_snap " \
+	"name=snap,type=CephString", \
+	"take cluster snapshot", \
+	"osd", "r", "cli")
 COMMAND("osd down " \
 	"type=CephString,name=ids,n=N", \
 	"set osd(s) [...] down", "osd", "rw", "cli,rest")
diff --git a/src/mon/OSDMonitor.cc b/src/mon/OSDMonitor.cc
index bfcc09e..b237846 100644
--- a/src/mon/OSDMonitor.cc
+++ b/src/mon/OSDMonitor.cc
@@ -4766,10 +4766,13 @@ bool OSDMonitor::prepare_command_impl(MMonCommand *m,
     }
 
   } else if (prefix == "osd cluster_snap") {
-    // ** DISABLE THIS FOR NOW **
-    ss << "cluster snapshot currently disabled (broken implementation)";
-    // ** DISABLE THIS FOR NOW **
-
+    string snap;
+    cmd_getval(g_ceph_context, cmdmap, "snap", snap);
+    pending_inc.cluster_snapshot = snap;
+    ss << "creating cluster snap " << snap;
+    getline(ss, rs);
+    wait_for_finished_proposal(new Monitor::C_Command(mon, m, 0, rs, get_last_committed()));
+    return true;
   } else if (prefix == "osd down" ||
 	     prefix == "osd out" ||
 	     prefix == "osd in" ||
diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index f2f5df5..eb4f246 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -6310,6 +6310,14 @@ void OSD::handle_osd_map(MOSDMap *m)
       }
     }
 
+    string cluster_snap = newmap->get_cluster_snapshot();
+    if (cluster_snap.length()) {
+      dout(0) << "creating cluster snapshot '" << cluster_snap << "'" << dendl;
+      int r = store->snapshot(cluster_snap);
+      if (r)
+	dout(0) << "failed to create cluster snapshot: " << cpp_strerror(r) << dendl;
+    }
+
     osdmap = newmap;
 
     superblock.current_epoch = cur;
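
As an illustration of the catch-up behaviour described in the patch
description, the OSD-side hunk can be modelled roughly as below.  This is a
toy sketch, not Ceph code: ToyMap, ToyObjectStore and catch_up() are
hypothetical names, and the assumption that a duplicate snapshot name fails
with -EEXIST is mine, not something the patch states.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>

// Hypothetical stand-in for one osdmap epoch.
struct ToyMap {
  unsigned epoch;
  std::string cluster_snapshot;  // empty when no snapshot was requested
};

// Hypothetical stand-in for the object store's snapshot() call.
struct ToyObjectStore {
  std::vector<std::string> snaps;

  // Returns 0 on success, a negative errno-style value on failure.
  // Assumption: taking a snapshot under an existing name fails.
  int snapshot(const std::string& name) {
    for (const auto& s : snaps)
      if (s == name)
        return -EEXIST;
    snaps.push_back(name);
    return 0;
  }
};

// Walk the maps an OSD missed while it was down and take every requested
// snapshot, as the handle_osd_map() hunk does once per new map.
// Returns the number of snapshot attempts that failed.
inline int catch_up(const std::vector<ToyMap>& maps, ToyObjectStore& store) {
  int failures = 0;
  for (const auto& m : maps) {
    if (m.cluster_snapshot.empty())
      continue;  // nothing requested in this epoch
    if (store.snapshot(m.cluster_snapshot) != 0)
      ++failures;  // the patch only logs the error and carries on
  }
  return failures;
}
```

Like the hunk above, the sketch treats a failed snapshot as non-fatal: the
OSD keeps processing later maps rather than aborting.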