From patchwork Fri Jan 4 14:53:54 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Elder
X-Patchwork-Id: 1933371
Message-ID: <50E6ED02.5070009@inktank.com>
Date: Fri, 04 Jan 2013 08:53:54 -0600
From: Alex Elder
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0
To: "ceph-devel@vger.kernel.org"
Subject: [PATCH REPOST 3/3] rbd: don't bother calculating file mapping
References: <50E6EC84.2060304@inktank.com>
In-Reply-To: <50E6EC84.2060304@inktank.com>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

When rbd_do_request() has a request to process it initializes a ceph
file layout structure and uses it to compute offsets and limits for
the range of the request using ceph_calc_file_object_mapping().

The layout used is fixed, and is based on RBD_MAX_OBJ_ORDER (30).
It sets the layout's object size and stripe unit to be 1 GB (2^30),
and sets the stripe count to be 1.

The job of ceph_calc_file_object_mapping() is to determine which
of a sequence of objects will contain data covered by the range, and
within that object, at what offset the range starts.  It also
truncates the length of the range at the end of the selected object
if necessary.

This is needed for ceph fs, but for rbd it really serves no purpose.
rbd does its own blocking of images into objects, each of which is
(1 << obj_order) in size, and as a result it ignores the "bno"
value returned by ceph_calc_file_object_mapping().  In addition,
by the point a request has reached this function, it is already
destined for a single rbd object, and its length will not exceed
that object's extent.
Because of this, and because the layout's object size is an integer
multiple of the image's object size (1 << obj_order),
ceph_calc_file_object_mapping() will never change the offset or
length values defined by the request.

In other words, this call is a big no-op for rbd data requests.

There is one exception.  We read the header object using this
function, and in that case we will not have already limited the
request size.  However, the header is a single object (not a file or
rbd image), and should not be broken into pieces anyway.  So in fact
we should *not* be calling ceph_calc_file_object_mapping() when
operating on the header object.

So...

Don't call ceph_calc_file_object_mapping() in rbd_do_request(),
because it is useless for image data and incorrect to do so for the
image header.

Signed-off-by: Alex Elder
---
 drivers/block/rbd.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index e6db737..072608e 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1123,9 +1123,6 @@ static int rbd_do_request(struct request *rq,
 {
 	struct ceph_osd_request *osd_req;
 	int ret;
-	u64 bno;
-	u64 obj_off = 0;
-	u64 obj_len = 0;
 	struct timespec mtime = CURRENT_TIME;
 	struct rbd_request *rbd_req;
 	struct ceph_osd_client *osdc;
@@ -1169,19 +1166,12 @@ static int rbd_do_request(struct request *rq,
 	osd_req->r_oid_len = strlen(osd_req->r_oid);
 
 	rbd_layout_init(&osd_req->r_file_layout, rbd_dev->spec->pool_id);
-	ret = ceph_calc_file_object_mapping(&osd_req->r_file_layout, ofs, len,
-				&bno, &obj_off, &obj_len);
-	rbd_assert(ret == 0);
-	if (obj_len < len) {
-		dout(" skipping last %llu, final file extent %llu~%llu\n",
-		     len - obj_len, ofs, obj_len);
-		len = obj_len;
-	}
+
 	if (op->op == CEPH_OSD_OP_READ || op->op == CEPH_OSD_OP_WRITE) {
-		op->extent.offset = obj_off;
-		op->extent.length = obj_len;
+		op->extent.offset = ofs;
+		op->extent.length = len;
 		if (op->op == CEPH_OSD_OP_WRITE)
-			op->payload_len = obj_len;
+			op->payload_len = len;
 	}
 	osd_req->r_num_pages = calc_pages_for(ofs, len);
 	osd_req->r_page_alignment = ofs & ~PAGE_MASK;