From patchwork Mon Sep 17 17:30:12 2018
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 10603153
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Mon, 17 Sep 2018 13:30:12 -0400
Message-Id: <1537205440-6656-3-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1537205440-6656-1-git-send-email-jsimmons@infradead.org>
References: <1537205440-6656-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 02/30] lustre: uapi: add documentation about FIDs

From: Niu Yawei

Add details about FIDs in lustre. FIDs are exposed to user land so
users can manage them. This provides the users details to assist
them.

Signed-off-by: Niu Yawei
WC-bug-id: https://jira.whamcloud.com/browse/LU-8998
Reviewed-on: https://review.whamcloud.com/24867
Reviewed-by: Andreas Dilger
Reviewed-by: Jinshan Xiong
Signed-off-by: James Simmons
---
 .../lustre/include/uapi/linux/lustre/lustre_idl.h | 84 ++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
index bec1028..58321eb 100644
--- a/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/include/uapi/linux/lustre/lustre_idl.h
@@ -223,6 +223,90 @@ enum {
 #define LL_HSM_MAX_ARCHIVE (sizeof(__u32) * 8)
 
 /**
+ * Different FID Format
+ * http://arch.lustre.org/index.php?title=Interoperability_fids_zfs#NEW.0
+ *
+ * FID:
+ * File IDentifier generated by client from range allocated by the seq service.
+ * First 0x400 sequences [2^33, 2^33 + 0x400] are reserved for system use.
+ * Note that on ldiskfs MDTs IGIF FIDs can use inode numbers starting at 12,
+ * but this is in the IGIF SEQ range and does not conflict with assigned FIDs.
+ *
+ * IGIF:
+ * Inode and Generation In FID, a surrogate FID used to globally identify an
+ * existing object on OLD formatted MDT file system. This would only be used on
+ * MDT0 in a DNE filesystem, because there are not expected to be any OLD
+ * formatted DNE filesystems. Belongs to a sequence in [12, 2^32 - 1] range,
+ * where sequence number is inode number, and inode generation is used as OID.
+ * NOTE: This assumes no more than 2^32-1 inodes exist in the MDT filesystem,
+ * which is the maximum possible for an ldiskfs backend. NOTE: This assumes
+ * that the reserved ext3/ext4/ldiskfs inode numbers [0-11] are never visible
+ * to clients, which has always been true.
+ *
+ * IDIF:
+ * Object ID in FID, a surrogate FID used to globally identify an existing
+ * object on OLD formatted OST file system. Belongs to a sequence in
+ * [2^32, 2^33 - 1]. Sequence number is calculated as:
+ *     1 << 32 | (ost_index << 16) | ((objid >> 32) & 0xffff)
+ * that is, SEQ consists of 16-bit OST index, and higher 16 bits of object ID.
+ * The generation of unique SEQ values per OST allows the IDIF FIDs to be
+ * identified in the FLD correctly. The OID field is calculated as:
+ *     objid & 0xffffffff
+ * that is, it consists of lower 32 bits of object ID. NOTE: This assumes that
+ * no more than 2^48-1 objects have ever been created on an OST, and that no
+ * more than 65535 OSTs are in use. Both are very reasonable assumptions (can
+ * uniquely map all objects on an OST that created 1M objects per second for 9
+ * years, or combinations thereof).
+ *
+ * OST_MDT0:
+ * Surrogate FID used to identify an existing object on OLD formatted OST
+ * filesystem.
+ * Belongs to the reserved sequence 0, and is used internally prior
+ * to the introduction of FID-on-OST, at which point IDIF will be used to
+ * identify objects as residing on a specific OST.
+ *
+ * LLOG:
+ * For Lustre Log objects the object sequence 1 is used. This is compatible with
+ * both OLD and NEW.1 namespaces, as this SEQ number is in the ext3/ldiskfs
+ * reserved inode range and does not conflict with IGIF sequence numbers.
+ *
+ * ECHO:
+ * For testing OST IO performance the object sequence 2 is used. This is
+ * compatible with both OLD and NEW.1 namespaces, as this SEQ number is in the
+ * ext3/ldiskfs reserved inode range and does not conflict with IGIF sequence
+ * numbers.
+ *
+ * OST_MDT1 .. OST_MAX:
+ * For testing with multiple MDTs the object sequence 3 through 9 is used,
+ * allowing direct mapping of MDTs 1 through 7 respectively, for a total of 8
+ * MDTs including OST_MDT0. This matches the legacy CMD project "group"
+ * mappings. However, this SEQ range is only for testing prior to any production
+ * DNE release, as the objects in this range conflict across all OSTs, as the
+ * OST index is not part of the FID.
+ *
+ *
+ * For compatibility with existing OLD OST network protocol structures, the FID
+ * must map onto the o_id and o_gr in a manner that ensures existing objects are
+ * identified consistently for IO, as well as onto the lock namespace to ensure
+ * both IDIFs map onto the same objects for IO as well as resources in the DLM.
+ *
+ * DLM OLD OBIF/IDIF:
+ *     resource[] = {o_id, o_seq, 0, 0}; // o_seq == 0 for production releases
+ *
+ * DLM NEW.1 FID (this is the same for both the MDT and OST):
+ *     resource[] = {SEQ, OID, VER, HASH};
+ *
+ * Note that for mapping IDIF values to DLM resource names the o_id may be
+ * larger than the 2^33 reserved sequence numbers for IDIF, so it is possible
+ * for the o_id numbers to overlap FID SEQ numbers in the resource.
+ * However, in all production releases the OLD o_seq field is always zero,
+ * and all valid FID OID values are non-zero, so the lock resources will not
+ * collide.
+ *
+ * For objects within the IDIF range, group extraction (non-CMD) will be:
+ *     o_id = (fid->f_seq & 0x7fff) << 16 | fid->f_oid;
+ *     o_seq = 0; // formerly group number
+ */
+
+/**
  * Note that reserved SEQ numbers below 12 will conflict with ldiskfs
  * inodes in the IGIF namespace, so these reserved SEQ numbers can be
  * used for other purposes and not risk collisions with existing inodes.