From patchwork Thu Aug 4 01:37:46 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 12935972
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
 (listserver-buz.dreamhost.com [69.163.136.29]) by smtp.lore.kernel.org
 (Postfix) with ESMTPS id 0D72DC19F2D; Thu, 4 Aug 2022 01:38:29 +0000 (UTC)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) by
 pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id
 4LyrwL16CVz1y73; Wed, 3 Aug 2022 18:38:26 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by
 smtp4.ccs.ornl.gov (Postfix) with ESMTP id A35F7100AFEF;
 Wed, 3 Aug 2022 21:38:23 -0400 (EDT)
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 3 Aug 2022 21:37:46 -0400
Message-Id: <1659577097-19253-2-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 01/32] lustre: mdc: Remove entry from list
 before freeing
List-Id: "For discussing Lustre software development."
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Oleg Drokin

mdc_changelog_cdev_init() forgot to remove entries from the list if
chardev allocation failed.

Fixes: dcedf3009a71 ("lustre: changelog: support large number of MDT")
WC-bug-id: https://jira.whamcloud.com/browse/LU-15901
Lustre-commit: 441ec2296a0938dd3 ("LU-15901 mdc: Remove entry from list before freeing")
Signed-off-by: Oleg Drokin
Reviewed-on: https://review.whamcloud.com/47480
Reviewed-by: Andreas Dilger
Reviewed-by: Lai Siyao
Reviewed-by: James Simmons
Signed-off-by: James Simmons
---
 fs/lustre/mdc/mdc_changelog.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/mdc/mdc_changelog.c b/fs/lustre/mdc/mdc_changelog.c
index d366720..36d7fdd 100644
--- a/fs/lustre/mdc/mdc_changelog.c
+++ b/fs/lustre/mdc/mdc_changelog.c
@@ -837,7 +837,7 @@ int mdc_changelog_cdev_init(struct obd_device *obd)
 
 	rc = chlg_minor_alloc(&minor);
 	if (rc)
-		goto out_unlock;
+		goto out_listrm;
 
 	device_initialize(&entry->ced_device);
 	entry->ced_device.devt = MKDEV(MAJOR(mdc_changelog_dev), minor);
@@ -866,6 +866,7 @@ int mdc_changelog_cdev_init(struct obd_device *obd)
 
 out_minor:
 	chlg_minor_free(minor);
+out_listrm:
 	list_del_init(&obd->u.cli.cl_chg_dev_linkage);
 	list_del(&entry->ced_link);
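The bug class fixed by patch 01 above is generic: an object is linked into a shared list early in an init path, and a later failure path frees it without unlinking it first, leaving a dangling pointer on the list. A minimal userspace sketch (not the Lustre code; `node`, `chg_init` and the list helpers are illustrative stand-ins for the kernel's `list_head` API):

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal doubly-linked list, standing in for the kernel's list_head. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *h) { h->prev = h->next = h; }

static void list_add_tail_node(struct node *n, struct node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_node(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Mirrors the shape of mdc_changelog_cdev_init(): the entry is linked
 * into a shared list early; if a later step (minor allocation) fails,
 * the fixed code jumps to a label that unlinks before freeing. */
static int chg_init(struct node *head, int minor_alloc_fails)
{
	struct node *entry = malloc(sizeof(*entry));

	if (!entry)
		return -1;
	list_add_tail_node(entry, head);

	if (minor_alloc_fails)
		goto out_listrm;	/* pre-patch code freed while still linked */
	return 0;

out_listrm:
	list_del_node(entry);		/* unlink first ... */
	free(entry);			/* ... then free: list stays consistent */
	return -1;
}
```

The patch's `out_listrm` label does exactly this reordering: the `list_del` calls now run on the allocation-failure path as well.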
From patchwork Thu Aug 4 01:37:47 2022
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 12935973
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 3 Aug 2022 21:37:47 -0400
Message-Id: <1659577097-19253-3-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 02/32] lustre: flr: Don't assume RDONLY implies SOM

From: Patrick Farrell

In lov_io_slice_mirror_init, the client code assumes that the
LCM_FL_RDONLY flag in the layout implies SOM and skips glimpse if it
sees one. The RDONLY flag means the mirrors are in sync, which has
historically implied SOM is valid.

To start with, using LCM_FL_RDONLY to imply SOM is sort of a layering
violation. SOM is only communicated from the MDS when it is valid, and
the client already skips glimpse in that case, so this duplicates
functionality from the higher layers.

More seriously, the patch "LU-14526 flr: mirror split downgrade SOM"
(https://review.whamcloud.com/43168/) made it possible to have
LCM_FL_RDONLY but not strict SOM, so this assumption is no longer
correct.

The fix is to not look at LCM_FL_RDONLY when deciding whether to
glimpse a file for size.

WC-bug-id: https://jira.whamcloud.com/browse/LU-15609
Lustre-commit: 250108ad754cfa932 ("LU-15609 flr: Don't assume RDONLY implies SOM")
Signed-off-by: Patrick Farrell
Reviewed-on: https://review.whamcloud.com/46666
Reviewed-by: John L. Hammond
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/include/obd_support.h | 7 +++----
 fs/lustre/lov/lov_io.c          | 7 -------
 2 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index 0732fe9a..b6c8a72 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -515,10 +515,9 @@
 #define OBD_FAIL_UNKNOWN_LMV_STRIPE	0x1901
 
 /* FLR */
-#define OBD_FAIL_FLR_GLIMPSE_IMMUTABLE	0x1A00
-#define OBD_FAIL_FLR_LV_DELAY		0x1A01
-#define OBD_FAIL_FLR_LV_INC		0x1A02
-#define OBD_FAIL_FLR_RANDOM_PICK_MIRROR	0x1A03
+#define OBD_FAIL_FLR_LV_DELAY		0x1A01
+#define OBD_FAIL_FLR_LV_INC		0x1A02
+#define OBD_FAIL_FLR_RANDOM_PICK_MIRROR	0x1A03
 
 /* LNet is allocated failure locations 0xe000 to 0xffff */
 /* Assign references to moved code to reduce code changes */
diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c
index b535092..32f028b 100644
--- a/fs/lustre/lov/lov_io.c
+++ b/fs/lustre/lov/lov_io.c
@@ -540,13 +540,6 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
 	case CIT_GLIMPSE:
 		lio->lis_pos = 0;
 		lio->lis_endpos = OBD_OBJECT_EOF;
-
-		if (lov_flr_state(obj) == LCM_FL_RDONLY &&
-		    !OBD_FAIL_CHECK(OBD_FAIL_FLR_GLIMPSE_IMMUTABLE)) {
-			/* SoM is accurate, no need glimpse */
-			result = 1;
-			goto out;
-		}
 		break;
 	case CIT_MISC:

From patchwork Thu Aug 4 01:37:48 2022
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 12935974
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 3 Aug 2022 21:37:48 -0400
Message-Id: <1659577097-19253-4-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 03/32] lustre: echo: remove client operations
 from echo objects
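Returning to patch 02 above: the removed shortcut treated "mirrors read-only" as a proxy for "size-on-MDT is valid", which LU-14526 broke. A toy model (not Lustre code; all names are illustrative — `rdonly` stands for LCM_FL_RDONLY, `som_strict` for the MDS declaring the cached size exact) shows why the proxy returns a stale size once the two states can diverge:

```c
#include <assert.h>

/* Toy layout state, illustrative only. */
struct toy_layout {
	int rdonly;		/* mirrors in sync (LCM_FL_RDONLY)      */
	int som_strict;		/* MDS says cached size is exact        */
	long long cached_size;	/* size-on-MDT (SOM)                    */
	long long ost_size;	/* what a glimpse of the OSTs would say */
};

/* Pre-patch logic: skip the glimpse whenever the layout is read-only. */
static long long file_size_old(const struct toy_layout *lo)
{
	if (lo->rdonly)
		return lo->cached_size;	/* stale once RDONLY no longer implies SOM */
	return lo->ost_size;		/* glimpse */
}

/* Post-patch logic: only trust the cached size when SOM itself is valid. */
static long long file_size_new(const struct toy_layout *lo)
{
	if (lo->som_strict)
		return lo->cached_size;
	return lo->ost_size;		/* glimpse */
}
```

With `rdonly = 1` but `som_strict = 0` (the state a mirror split can now produce), the old logic reports the cached size while the new logic correctly glimpses the OSTs.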
From: "John L. Hammond"

Remove the client (io, page, lock) operations from echo_client
objects. This will facilitate the simplification of CLIO.

WC-bug-id: https://jira.whamcloud.com/browse/LU-10994
Lustre-commit: 6060ee55b194e37e8 ("LU-10994 echo: remove client operations from echo objects")
Signed-off-by: John L. Hammond
Reviewed-on: https://review.whamcloud.com/47240
Reviewed-by: Patrick Farrell
Reviewed-by: Bobi Jam
Reviewed-by: James Simmons
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/obdecho/echo_client.c | 542 +---------------------------------------
 1 file changed, 8 insertions(+), 534 deletions(-)

diff --git a/fs/lustre/obdecho/echo_client.c b/fs/lustre/obdecho/echo_client.c
index 4cc046a..f25ea41 100644
--- a/fs/lustre/obdecho/echo_client.c
+++ b/fs/lustre/obdecho/echo_client.c
@@ -70,7 +70,6 @@ struct echo_object {
 	struct echo_device *eo_dev;
 	struct list_head eo_obj_chain;
 	struct lov_oinfo *eo_oinfo;
-	atomic_t eo_npages;
 	int eo_deleted;
 };
@@ -79,19 +78,6 @@ struct echo_object_conf {
 	struct lov_oinfo **eoc_oinfo;
 };
 
-struct echo_page {
-	struct cl_page_slice ep_cl;
-	unsigned long ep_lock;
-};
-
-struct echo_lock {
-	struct cl_lock_slice el_cl;
-	struct list_head el_chain;
-	struct echo_object *el_object;
-	u64 el_cookie;
-	atomic_t el_refcount;
-};
-
 static int echo_client_setup(const struct lu_env *env,
 			     struct obd_device *obd,
 			     struct lustre_cfg *lcfg);
@@ -100,48 +86,34 @@ static int echo_client_setup(const struct lu_env *env,
 /** \defgroup echo_helpers Helper functions
  * @{
  */
-static inline struct echo_device *cl2echo_dev(const struct cl_device *dev)
+static struct echo_device *cl2echo_dev(const struct cl_device *dev)
 {
 	return container_of_safe(dev, struct echo_device, ed_cl);
 }
 
-static inline struct cl_device *echo_dev2cl(struct echo_device *d)
+static struct cl_device *echo_dev2cl(struct echo_device *d)
 {
 	return &d->ed_cl;
 }
 
-static inline struct echo_device *obd2echo_dev(const struct obd_device *obd)
+static struct echo_device *obd2echo_dev(const struct obd_device *obd)
 {
 	return cl2echo_dev(lu2cl_dev(obd->obd_lu_dev));
 }
 
-static inline struct cl_object *echo_obj2cl(struct echo_object *eco)
+static struct cl_object *echo_obj2cl(struct echo_object *eco)
 {
 	return &eco->eo_cl;
 }
 
-static inline struct echo_object *cl2echo_obj(const struct cl_object *o)
+static struct echo_object *cl2echo_obj(const struct cl_object *o)
 {
 	return container_of(o, struct echo_object, eo_cl);
 }
 
-static inline struct echo_page *cl2echo_page(const struct cl_page_slice *s)
-{
-	return container_of(s, struct echo_page, ep_cl);
-}
-
-static inline struct echo_lock *cl2echo_lock(const struct cl_lock_slice *s)
-{
-	return container_of(s, struct echo_lock, el_cl);
-}
-
-static inline struct cl_lock *echo_lock2cl(const struct echo_lock *ecl)
-{
-	return ecl->el_cl.cls_lock;
-}
-
 static struct lu_context_key echo_thread_key;
-static inline struct echo_thread_info *echo_env_info(const struct lu_env *env)
+
+static struct echo_thread_info *echo_env_info(const struct lu_env *env)
 {
 	struct echo_thread_info *info;
 
@@ -158,16 +130,10 @@ struct echo_object_conf *cl2echo_conf(const struct cl_object_conf *c)
 /** @} echo_helpers */
 
 static int cl_echo_object_put(struct echo_object *eco);
-static int cl_echo_object_brw(struct echo_object *eco, int rw, u64 offset,
-			      struct page **pages, int npages, int async);
 
 struct echo_thread_info {
 	struct echo_object_conf eti_conf;
 	struct lustre_md eti_md;
-
-	struct cl_2queue eti_queue;
-	struct cl_io eti_io;
-	struct cl_lock eti_lock;
 	struct lu_fid eti_fid;
 	struct lu_fid eti_fid2;
 };
@@ -177,18 +143,12 @@ struct echo_session_info {
 	unsigned long dummy;
 };
 
-static struct kmem_cache *echo_lock_kmem;
 static struct kmem_cache *echo_object_kmem;
 static struct kmem_cache *echo_thread_kmem;
 static struct kmem_cache *echo_session_kmem;
 
 static struct lu_kmem_descr echo_caches[] = {
 	{
-		.ckd_cache = &echo_lock_kmem,
-		.ckd_name = "echo_lock_kmem",
-		.ckd_size = sizeof(struct echo_lock)
-	},
-	{
 		.ckd_cache = &echo_object_kmem,
 		.ckd_name = "echo_object_kmem",
 		.ckd_size = sizeof(struct echo_object)
@@ -208,191 +168,6 @@ struct echo_session_info {
 	}
 };
 
-/** \defgroup echo_page Page operations
- *
- * Echo page operations.
- *
- * @{
- */
-static int echo_page_own(const struct lu_env *env,
-			 const struct cl_page_slice *slice,
-			 struct cl_io *io, int nonblock)
-{
-	struct echo_page *ep = cl2echo_page(slice);
-
-	if (nonblock) {
-		if (test_and_set_bit(0, &ep->ep_lock))
-			return -EAGAIN;
-	} else {
-		while (test_and_set_bit(0, &ep->ep_lock))
-			wait_on_bit(&ep->ep_lock, 0, TASK_UNINTERRUPTIBLE);
-	}
-	return 0;
-}
-
-static void echo_page_disown(const struct lu_env *env,
-			     const struct cl_page_slice *slice,
-			     struct cl_io *io)
-{
-	struct echo_page *ep = cl2echo_page(slice);
-
-	LASSERT(test_bit(0, &ep->ep_lock));
-	clear_and_wake_up_bit(0, &ep->ep_lock);
-}
-
-static void echo_page_discard(const struct lu_env *env,
-			      const struct cl_page_slice *slice,
-			      struct cl_io *unused)
-{
-	cl_page_delete(env, slice->cpl_page);
-}
-
-static int echo_page_is_vmlocked(const struct lu_env *env,
-				 const struct cl_page_slice *slice)
-{
-	if (test_bit(0, &cl2echo_page(slice)->ep_lock))
-		return -EBUSY;
-	return -ENODATA;
-}
-
-static void echo_page_completion(const struct lu_env *env,
-				 const struct cl_page_slice *slice,
-				 int ioret)
-{
-	LASSERT(slice->cpl_page->cp_sync_io);
-}
-
-static void echo_page_fini(const struct lu_env *env,
-			   struct cl_page_slice *slice,
-			   struct pagevec *pvec)
-{
-	struct echo_object *eco = cl2echo_obj(slice->cpl_obj);
-
-	atomic_dec(&eco->eo_npages);
-	put_page(slice->cpl_page->cp_vmpage);
-}
-
-static int echo_page_prep(const struct lu_env *env,
-			  const struct cl_page_slice *slice,
-			  struct cl_io *unused)
-{
-	return 0;
-}
-
-static int echo_page_print(const struct lu_env *env,
-			   const struct cl_page_slice *slice,
-			   void *cookie, lu_printer_t printer)
-{
-	struct echo_page *ep = cl2echo_page(slice);
-
-	(*printer)(env, cookie, LUSTRE_ECHO_CLIENT_NAME "-page@%p %d vm@%p\n",
-		   ep, test_bit(0, &ep->ep_lock),
-		   slice->cpl_page->cp_vmpage);
-	return 0;
-}
-
-static const struct cl_page_operations echo_page_ops = {
-	.cpo_own = echo_page_own,
-	.cpo_disown = echo_page_disown,
-	.cpo_discard = echo_page_discard,
-	.cpo_fini = echo_page_fini,
-	.cpo_print = echo_page_print,
-	.cpo_is_vmlocked = echo_page_is_vmlocked,
-	.io = {
-		[CRT_READ] = {
-			.cpo_prep = echo_page_prep,
-			.cpo_completion = echo_page_completion,
-		},
-		[CRT_WRITE] = {
-			.cpo_prep = echo_page_prep,
-			.cpo_completion = echo_page_completion,
-		}
-	}
-};
-
-/** @} echo_page */
-
-/** \defgroup echo_lock Locking
- *
- * echo lock operations
- *
- * @{
- */
-static void echo_lock_fini(const struct lu_env *env,
-			   struct cl_lock_slice *slice)
-{
-	struct echo_lock *ecl = cl2echo_lock(slice);
-
-	LASSERT(list_empty(&ecl->el_chain));
-	kmem_cache_free(echo_lock_kmem, ecl);
-}
-
-static const struct cl_lock_operations echo_lock_ops = {
-	.clo_fini = echo_lock_fini,
-};
-
-/** @} echo_lock */
-
-/** \defgroup echo_cl_ops cl_object operations
- *
- * operations for cl_object
- *
- * @{
- */
-static int echo_page_init(const struct lu_env *env, struct cl_object *obj,
-			  struct cl_page *page, pgoff_t index)
-{
-	struct echo_page *ep = cl_object_page_slice(obj, page);
-	struct echo_object *eco = cl2echo_obj(obj);
-
-	get_page(page->cp_vmpage);
-	/*
-	 * ep_lock is similar to the lock_page() lock, and
-	 * cannot usefully be monitored by lockdep.
-	 * So just a bit in an "unsigned long" and use the
-	 * wait_on_bit() interface to wait for the bit to be clera.
-	 */
-	ep->ep_lock = 0;
-	cl_page_slice_add(page, &ep->ep_cl, obj, &echo_page_ops);
-	atomic_inc(&eco->eo_npages);
-	return 0;
-}
-
-static int echo_io_init(const struct lu_env *env, struct cl_object *obj,
-			struct cl_io *io)
-{
-	return 0;
-}
-
-static int echo_lock_init(const struct lu_env *env,
-			  struct cl_object *obj, struct cl_lock *lock,
-			  const struct cl_io *unused)
-{
-	struct echo_lock *el;
-
-	el = kmem_cache_zalloc(echo_lock_kmem, GFP_NOFS);
-	if (el) {
-		cl_lock_slice_add(lock, &el->el_cl, obj, &echo_lock_ops);
-		el->el_object = cl2echo_obj(obj);
-		INIT_LIST_HEAD(&el->el_chain);
-		atomic_set(&el->el_refcount, 0);
-	}
-	return !el ? -ENOMEM : 0;
-}
-
-static int echo_conf_set(const struct lu_env *env, struct cl_object *obj,
-			 const struct cl_object_conf *conf)
-{
-	return 0;
-}
-
-static const struct cl_object_operations echo_cl_obj_ops = {
-	.coo_page_init = echo_page_init,
-	.coo_lock_init = echo_lock_init,
-	.coo_io_init = echo_io_init,
-	.coo_conf_set = echo_conf_set
-};
-
 /** @} echo_cl_ops */
 
 /** \defgroup echo_lu_ops lu_object operations
@@ -434,8 +209,7 @@ static int echo_object_init(const struct lu_env *env, struct lu_object *obj,
 	*econf->eoc_oinfo = NULL;
 
 	eco->eo_dev = ed;
-	atomic_set(&eco->eo_npages, 0);
-	cl_object_page_init(lu2cl(obj), sizeof(struct echo_page));
+	cl_object_page_init(lu2cl(obj), 0);
 
 	spin_lock(&ec->ec_lock);
 	list_add_tail(&eco->eo_obj_chain, &ec->ec_objects);
@@ -455,8 +229,6 @@ static void echo_object_delete(const struct lu_env *env, struct lu_object *obj)
 
 	ec = eco->eo_dev->ed_ec;
 
-	LASSERT(atomic_read(&eco->eo_npages) == 0);
-
 	spin_lock(&ec->ec_lock);
 	list_del_init(&eco->eo_obj_chain);
 	spin_unlock(&ec->ec_lock);
@@ -527,7 +299,6 @@ static struct lu_object *echo_object_alloc(const struct lu_env *env,
 		lu_object_init(obj, &hdr->coh_lu, dev);
 		lu_object_add_top(&hdr->coh_lu, obj);
-		eco->eo_cl.co_ops = &echo_cl_obj_ops;
 		obj->lo_ops = &echo_lu_obj_ops;
 	}
 	return obj;
@@ -741,15 +512,6 @@ static struct lu_device *echo_device_fini(const struct lu_env *env,
 	return NULL;
 }
 
-static void echo_lock_release(const struct lu_env *env,
-			      struct echo_lock *ecl,
-			      int still_used)
-{
-	struct cl_lock *clk = echo_lock2cl(ecl);
-
-	cl_lock_release(env, clk);
-}
-
 static struct lu_device *echo_device_free(const struct lu_env *env,
 					  struct lu_device *d)
 {
@@ -934,193 +696,6 @@ static int cl_echo_object_put(struct echo_object *eco)
 	return 0;
 }
 
-static int __cl_echo_enqueue(struct lu_env *env, struct echo_object *eco,
-			     u64 start, u64 end, int mode,
-			     u64 *cookie, u32 enqflags)
-{
-	struct cl_io *io;
-	struct cl_lock *lck;
-	struct cl_object *obj;
-	struct cl_lock_descr *descr;
-	struct echo_thread_info *info;
-	int rc = -ENOMEM;
-
-	info = echo_env_info(env);
-	io = &info->eti_io;
-	lck = &info->eti_lock;
-	obj = echo_obj2cl(eco);
-
-	memset(lck, 0, sizeof(*lck));
-	descr = &lck->cll_descr;
-	descr->cld_obj = obj;
-	descr->cld_start = cl_index(obj, start);
-	descr->cld_end = cl_index(obj, end);
-	descr->cld_mode = mode == LCK_PW ? CLM_WRITE : CLM_READ;
-	descr->cld_enq_flags = enqflags;
-	io->ci_obj = obj;
-
-	rc = cl_lock_request(env, io, lck);
-	if (rc == 0) {
-		struct echo_client_obd *ec = eco->eo_dev->ed_ec;
-		struct echo_lock *el;
-
-		el = cl2echo_lock(cl_lock_at(lck, &echo_device_type));
-		spin_lock(&ec->ec_lock);
-		if (list_empty(&el->el_chain)) {
-			list_add(&el->el_chain, &ec->ec_locks);
-			el->el_cookie = ++ec->ec_unique;
-		}
-		atomic_inc(&el->el_refcount);
-		*cookie = el->el_cookie;
-		spin_unlock(&ec->ec_lock);
-	}
-	return rc;
-}
-
-static int __cl_echo_cancel(struct lu_env *env, struct echo_device *ed,
-			    u64 cookie)
-{
-	struct echo_client_obd *ec = ed->ed_ec;
-	struct echo_lock *ecl = NULL;
-	int found = 0, still_used = 0;
-
-	spin_lock(&ec->ec_lock);
-	list_for_each_entry(ecl, &ec->ec_locks, el_chain) {
-		CDEBUG(D_INFO, "ecl: %p, cookie: %#llx\n", ecl, ecl->el_cookie);
-		found = (ecl->el_cookie == cookie);
-		if (found) {
-			if (atomic_dec_and_test(&ecl->el_refcount))
-				list_del_init(&ecl->el_chain);
-			else
-				still_used = 1;
-			break;
-		}
-	}
-	spin_unlock(&ec->ec_lock);
-
-	if (!found)
-		return -ENOENT;
-
-	echo_lock_release(env, ecl, still_used);
-	return 0;
-}
-
-static void echo_commit_callback(const struct lu_env *env, struct cl_io *io,
-				 struct pagevec *pvec)
-{
-	struct echo_thread_info *info;
-	struct cl_2queue *queue;
-	int i = 0;
-
-	info = echo_env_info(env);
-	LASSERT(io == &info->eti_io);
-
-	queue = &info->eti_queue;
-
-	for (i = 0; i < pagevec_count(pvec); i++) {
-		struct page *vmpage = pvec->pages[i];
-		struct cl_page *page = (struct cl_page *)vmpage->private;
-
-		cl_page_list_add(&queue->c2_qout, page, true);
-	}
-}
-
-static int cl_echo_object_brw(struct echo_object *eco, int rw, u64 offset,
-			      struct page **pages, int npages, int async)
-{
-	struct lu_env *env;
-	struct echo_thread_info *info;
-	struct cl_object *obj = echo_obj2cl(eco);
-	struct echo_device *ed = eco->eo_dev;
-	struct cl_2queue *queue;
-	struct cl_io *io;
-	struct cl_page *clp;
-	struct lustre_handle lh = { 0 };
-	size_t page_size = cl_page_size(obj);
-	u16 refcheck;
-	int rc;
-	int i;
-
-	LASSERT((offset & ~PAGE_MASK) == 0);
-	LASSERT(ed->ed_next);
-	env = cl_env_get(&refcheck);
-	if (IS_ERR(env))
-		return PTR_ERR(env);
-
-	info = echo_env_info(env);
-	io = &info->eti_io;
-	queue = &info->eti_queue;
-
-	cl_2queue_init(queue);
-
-	io->ci_ignore_layout = 1;
-	rc = cl_io_init(env, io, CIT_MISC, obj);
-	if (rc < 0)
-		goto out;
-	LASSERT(rc == 0);
-
-	rc = __cl_echo_enqueue(env, eco, offset,
-			       offset + npages * PAGE_SIZE - 1,
-			       rw == READ ? LCK_PR : LCK_PW, &lh.cookie,
-			       CEF_NEVER);
-	if (rc < 0)
-		goto error_lock;
-
-	for (i = 0; i < npages; i++) {
-		LASSERT(pages[i]);
-		clp = cl_page_find(env, obj, cl_index(obj, offset),
-				   pages[i], CPT_TRANSIENT);
-		if (IS_ERR(clp)) {
-			rc = PTR_ERR(clp);
-			break;
-		}
-		LASSERT(clp->cp_type == CPT_TRANSIENT);
-
-		rc = cl_page_own(env, io, clp);
-		if (rc) {
-			LASSERT(clp->cp_state == CPS_FREEING);
-			cl_page_put(env, clp);
-			break;
-		}
-		/*
-		 * Add a page to the incoming page list of 2-queue.
-		 */
-		cl_page_list_add(&queue->c2_qin, clp, true);
-
-		/* drop the reference count for cl_page_find, so that the page
-		 * will be freed in cl_2queue_fini.
-		 */
-		cl_page_put(env, clp);
-		cl_page_clip(env, clp, 0, page_size);
-
-		offset += page_size;
-	}
-
-	if (rc == 0) {
-		enum cl_req_type typ = rw == READ ? CRT_READ : CRT_WRITE;
-
-		async = async && (typ == CRT_WRITE);
-		if (async)
-			rc = cl_io_commit_async(env, io, &queue->c2_qin,
-						0, PAGE_SIZE,
-						echo_commit_callback);
-		else
-			rc = cl_io_submit_sync(env, io, typ, queue, 0);
-		CDEBUG(D_INFO, "echo_client %s write returns %d\n",
-		       async ? "async" : "sync", rc);
-	}
-
-	__cl_echo_cancel(env, ed, lh.cookie);
-error_lock:
-	cl_2queue_discard(env, io, queue);
-	cl_2queue_disown(env, io, queue);
-	cl_2queue_fini(env, queue);
-	cl_io_fini(env, io);
-out:
-	cl_env_put(env, &refcheck);
-	return rc;
-}
-
 /** @} echo_exports */
 
 static u64 last_object_id;
@@ -1264,101 +839,6 @@ static int echo_client_page_debug_check(struct page *page, u64 id,
 	return rc;
 }
 
-static int echo_client_kbrw(struct echo_device *ed, int rw, struct obdo *oa,
-			    struct echo_object *eco, u64 offset,
-			    u64 count, int async)
-{
-	u32 npages;
-	struct brw_page *pga;
-	struct brw_page *pgp;
-	struct page **pages;
-	u64 off;
-	int i;
-	int rc;
-	int verify;
-	gfp_t gfp_mask;
-	int brw_flags = 0;
-
-	verify = (ostid_id(&oa->o_oi) != ECHO_PERSISTENT_OBJID &&
-		  (oa->o_valid & OBD_MD_FLFLAGS) != 0 &&
-		  (oa->o_flags & OBD_FL_DEBUG_CHECK) != 0);
-
-	gfp_mask = ((ostid_id(&oa->o_oi) & 2) == 0) ? GFP_KERNEL : GFP_HIGHUSER;
-
-	LASSERT(rw == OBD_BRW_WRITE || rw == OBD_BRW_READ);
-
-	if (count <= 0 ||
-	    (count & (~PAGE_MASK)) != 0)
-		return -EINVAL;
-
-	/* XXX think again with misaligned I/O */
-	npages = count >> PAGE_SHIFT;
-
-	if (rw == OBD_BRW_WRITE)
-		brw_flags = OBD_BRW_ASYNC;
-
-	pga = kcalloc(npages, sizeof(*pga), GFP_NOFS);
-	if (!pga)
-		return -ENOMEM;
-
-	pages = kvmalloc_array(npages, sizeof(*pages),
-			       GFP_KERNEL | __GFP_ZERO);
-	if (!pages) {
-		kfree(pga);
-		return -ENOMEM;
-	}
-
-	for (i = 0, pgp = pga, off = offset;
-	     i < npages;
-	     i++, pgp++, off += PAGE_SIZE) {
-		LASSERT(!pgp->pg);	/* for cleanup */
-
-		rc = -ENOMEM;
-		pgp->pg = alloc_page(gfp_mask);
-		if (!pgp->pg)
-			goto out;
-
-		/* set mapping so page is not considered encrypted */
-		pgp->pg->mapping = ECHO_MAPPING_UNENCRYPTED;
-		pages[i] = pgp->pg;
-		pgp->count = PAGE_SIZE;
-		pgp->off = off;
-		pgp->flag = brw_flags;
-
-		if (verify)
-			echo_client_page_debug_setup(pgp->pg, rw,
						     ostid_id(&oa->o_oi), off,
						     pgp->count);
-	}
-
-	/* brw mode can only be used at client */
-	LASSERT(ed->ed_next);
-	rc = cl_echo_object_brw(eco, rw, offset, pages, npages, async);
-
-out:
-	if (rc != 0 || rw != OBD_BRW_READ)
-		verify = 0;
-
-	for (i = 0, pgp = pga; i < npages; i++, pgp++) {
-		if (!pgp->pg)
-			continue;
-
-		if (verify) {
-			int vrc;
-
-			vrc = echo_client_page_debug_check(pgp->pg,
							   ostid_id(&oa->o_oi),
							   pgp->off, pgp->count);
-			if (vrc != 0 && rc == 0)
-				rc = vrc;
-		}
-		__free_page(pgp->pg);
-	}
-	kfree(pga);
-	kvfree(pages);
-	return rc;
-}
-
 static int echo_client_prep_commit(const struct lu_env *env,
				   struct obd_export *exp, int rw,
				   struct obdo *oa, struct echo_object *eco,
@@ -1491,12 +971,6 @@ static int echo_client_brw_ioctl(const struct lu_env *env, int rw,
		data->ioc_plen1 = PTLRPC_MAX_BRW_SIZE;
 
 	switch (test_mode) {
-	case 1:
-		/* fall through */
-	case 2:
-		rc = echo_client_kbrw(ed, rw, oa, eco, data->ioc_offset,
-				      data->ioc_count, async);
-		break;
	case 3:
		rc = echo_client_prep_commit(env, ec->ec_exp, rw, oa, eco,
					     data->ioc_offset, data->ioc_count,
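The echo helpers that patch 03 keeps (`cl2echo_dev()`, `cl2echo_obj()`, ...) all rely on the kernel's `container_of()` idiom: the generic CLIO layer hands back a pointer to an embedded member, and the wrapper recovers the enclosing device or object. A userspace re-creation of the pattern (struct names are illustrative stand-ins, not the real Lustre types):

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define container_of_(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct cl_device_ { int d_refs; };		/* generic layer's view */

struct echo_device_ {
	int ed_id;
	struct cl_device_ ed_cl;		/* embedded generic part */
};

/* Mirrors the shape of cl2echo_dev(): downcast from the embedded
 * cl_device back to the enclosing echo_device. */
static struct echo_device_ *cl2echo_dev_(struct cl_device_ *dev)
{
	return container_of_(dev, struct echo_device_, ed_cl);
}
```

The round trip is exact: taking `&ed.ed_cl` and downcasting returns the original `echo_device_` pointer, which is why these helpers can be plain functions with no state.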
From patchwork Thu Aug 4 01:37:49 2022
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 12935975
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 3 Aug 2022 21:37:49 -0400
Message-Id: <1659577097-19253-5-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 04/32] lustre: clio: remove cl_page_export()
 and cl_page_is_vmlocked()

From: "John L. Hammond"

Remove cl_page_export() and cl_page_is_vmlocked(), replacing them with
direct calls to SetPageUptodate() and PageLocked().

WC-bug-id: https://jira.whamcloud.com/browse/LU-10994
Lustre-commit: 3d52a7c5753e80e78 ("LU-10994 clio: remove cl_page_export() and cl_page_is_vmlocked()")
Signed-off-by: John L. Hammond
Reviewed-on: https://review.whamcloud.com/47241
Reviewed-by: Patrick Farrell
Reviewed-by: Bobi Jam
Reviewed-by: James Simmons
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/include/cl_object.h | 22 ---------------------
 fs/lustre/llite/file.c        |  2 +-
 fs/lustre/llite/rw.c          |  4 ++--
 fs/lustre/llite/vvp_page.c    | 31 +----------------------------
 fs/lustre/lov/lov_page.c      |  2 +-
 fs/lustre/obdclass/cl_io.c    |  1 -
 fs/lustre/obdclass/cl_page.c  | 46 -------------------------------------------
 fs/lustre/osc/osc_lock.c      |  2 +-
 8 files changed, 6 insertions(+), 104 deletions(-)

diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h
index 5be89d6..db5f610 100644
--- a/fs/lustre/include/cl_object.h
+++ b/fs/lustre/include/cl_object.h
@@ -872,26 +872,6 @@ struct cl_page_operations {
			    const struct cl_page_slice *slice,
			    struct cl_io *io);
	/**
-	 * Announces whether the page contains valid data or not by @uptodate.
-	 *
-	 * \see cl_page_export()
-	 * \see vvp_page_export()
-	 */
-	void (*cpo_export)(const struct lu_env *env,
-			   const struct cl_page_slice *slice, int uptodate);
-	/**
-	 * Checks whether underlying VM page is locked (in the suitable
-	 * sense). Used for assertions.
-	 *
-	 * Return: -EBUSY means page is protected by a lock of a given
-	 *		  mode;
-	 *	   -ENODATA when page is not protected by a lock;
-	 *	   0 this layer cannot decide. (Should never happen.)
-	 */
-	int (*cpo_is_vmlocked)(const struct lu_env *env,
-			       const struct cl_page_slice *slice);
-
	/**
	 * Update file attributes when all we have is this page. Used for tiny
	 * writes to update attributes when we don't have a full cl_io.
	 */
@@ -2346,10 +2326,8 @@ int cl_page_flush(const struct lu_env *env, struct cl_io *io,
 void cl_page_discard(const struct lu_env *env, struct cl_io *io,
		     struct cl_page *pg);
 void cl_page_delete(const struct lu_env *env, struct cl_page *pg);
-int cl_page_is_vmlocked(const struct lu_env *env, const struct cl_page *pg);
 void cl_page_touch(const struct lu_env *env, const struct cl_page *pg,
		   size_t to);
-void cl_page_export(const struct lu_env *env, struct cl_page *pg, int uptodate);
 loff_t cl_offset(const struct cl_object *obj, pgoff_t idx);
 pgoff_t cl_index(const struct cl_object *obj, loff_t offset);
 size_t cl_page_size(const struct cl_object *obj);
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index efe117d..0e71b3a 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -587,7 +587,7 @@ void ll_dom_finish_open(struct inode *inode, struct ptlrpc_request *req)
			put_page(vmpage);
			break;
		}
-		cl_page_export(env, page, 1);
+		SetPageUptodate(vmpage);
		cl_page_put(env, page);
		unlock_page(vmpage);
		put_page(vmpage);
diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c
index c807217..478ef1b 100644
--- a/fs/lustre/llite/rw.c
+++ b/fs/lustre/llite/rw.c
@@ -1664,7 +1664,7 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
	cl_2queue_init(queue);
	if (uptodate) {
		vpg->vpg_ra_used = 1;
-		cl_page_export(env, page, 1);
+		SetPageUptodate(page->cp_vmpage);
		cl_page_disown(env, io, page);
	} else {
		anchor = &vvp_env_info(env)->vti_anchor;
@@ -1908,7 +1908,7 @@ int ll_readpage(struct file *file, struct page *vmpage)
		/* export the page and skip io stack */
		if (result == 0) {
			vpg->vpg_ra_used = 1;
-			cl_page_export(env, page, 1);
+			SetPageUptodate(vmpage);
		} else {
			ll_ra_stats_inc_sbi(sbi, RA_STAT_FAILED_FAST_READ);
		}
diff --git a/fs/lustre/llite/vvp_page.c b/fs/lustre/llite/vvp_page.c
index 7744e9b..82ce5ab 100644
--- a/fs/lustre/llite/vvp_page.c
+++ b/fs/lustre/llite/vvp_page.c
@@ -170,26 +170,6 @@ static void vvp_page_delete(const struct lu_env *env,
	 */
 }
 
-static void vvp_page_export(const struct lu_env *env,
-			    const struct cl_page_slice *slice,
-			    int uptodate)
-{
-	struct page *vmpage = cl2vm_page(slice);
-
-	LASSERT(vmpage);
-	LASSERT(PageLocked(vmpage));
-	if (uptodate)
-		SetPageUptodate(vmpage);
-	else
-		ClearPageUptodate(vmpage);
-}
-
-static int vvp_page_is_vmlocked(const struct lu_env *env,
-				const struct cl_page_slice *slice)
-{
-	return PageLocked(cl2vm_page(slice)) ? -EBUSY : -ENODATA;
-}
-
 static int vvp_page_prep_read(const struct lu_env *env,
			      const struct cl_page_slice *slice,
			      struct cl_io *unused)
@@ -260,7 +240,7 @@ static void vvp_page_completion_read(const struct lu_env *env,
 
	if (ioret == 0) {
		if (!vpg->vpg_defer_uptodate)
-			cl_page_export(env, page, 1);
+			SetPageUptodate(vmpage);
	} else if (vpg->vpg_defer_uptodate) {
		vpg->vpg_defer_uptodate = 0;
		if (ioret == -EAGAIN) {
@@ -382,8 +362,6 @@ static int vvp_page_fail(const struct lu_env *env,
	.cpo_disown = vvp_page_disown,
	.cpo_discard = vvp_page_discard,
	.cpo_delete = vvp_page_delete,
-	.cpo_export = vvp_page_export,
-	.cpo_is_vmlocked = vvp_page_is_vmlocked,
	.cpo_fini = vvp_page_fini,
	.cpo_print = vvp_page_print,
	.io = {
@@ -412,15 +390,8 @@ static void vvp_transient_page_discard(const struct lu_env *env,
	cl_page_delete(env, page);
 }
 
-static int vvp_transient_page_is_vmlocked(const struct lu_env *env,
-					  const struct cl_page_slice *slice)
-{
-	return -EBUSY;
-}
-
 static const struct cl_page_operations vvp_transient_page_ops = {
	.cpo_discard = vvp_transient_page_discard,
-	.cpo_is_vmlocked = vvp_transient_page_is_vmlocked,
	.cpo_print = vvp_page_print,
 };
 
diff --git a/fs/lustre/lov/lov_page.c b/fs/lustre/lov/lov_page.c
index bd6ba79..a22b71f 100644
--- a/fs/lustre/lov/lov_page.c
+++ b/fs/lustre/lov/lov_page.c
@@ -144,7 +144,7 @@ int lov_page_init_empty(const struct lu_env *env, struct cl_object *obj,
	addr = kmap(page->cp_vmpage);
	memset(addr, 0, cl_page_size(obj));
	kunmap(page->cp_vmpage);
-	cl_page_export(env, page, 1);
+	SetPageUptodate(page->cp_vmpage);
	return 0;
 }
 
diff --git a/fs/lustre/obdclass/cl_io.c b/fs/lustre/obdclass/cl_io.c
index 6dd029a..c97ac0f 100644
--- a/fs/lustre/obdclass/cl_io.c
+++ b/fs/lustre/obdclass/cl_io.c
@@ -849,7 +849,6 @@ void cl_page_list_del(const struct lu_env *env, struct cl_page_list *plist,
		      struct cl_page *page)
 {
	LASSERT(plist->pl_nr > 0);
-	LASSERT(cl_page_is_vmlocked(env, page));
 
	list_del_init(&page->cp_batch);
	--plist->pl_nr;
diff --git a/fs/lustre/obdclass/cl_page.c b/fs/lustre/obdclass/cl_page.c
index 9326743..b5b5448 100644
--- a/fs/lustre/obdclass/cl_page.c
+++ b/fs/lustre/obdclass/cl_page.c
@@ -760,52 +760,6 @@ void cl_page_delete(const struct lu_env *env, struct cl_page *pg)
 }
 EXPORT_SYMBOL(cl_page_delete);
 
-/**
- * Marks page up-to-date.
- *
- * Call cl_page_operations::cpo_export() through all layers top-to-bottom.
- * The layer responsible for VM interaction has to mark/clear page as
- * up-to-date by the @uptodate argument.
- *
- * \see cl_page_operations::cpo_export()
- */
-void cl_page_export(const struct lu_env *env, struct cl_page *cl_page,
-		    int uptodate)
-{
-	const struct cl_page_slice *slice;
-	int i;
-
-	cl_page_slice_for_each(cl_page, slice, i) {
-		if (slice->cpl_ops->cpo_export)
-			(*slice->cpl_ops->cpo_export)(env, slice, uptodate);
-	}
-}
-EXPORT_SYMBOL(cl_page_export);
-
-/**
- * Returns true, if @cl_page is VM locked in a suitable sense by the calling
- * thread.
- */
-int cl_page_is_vmlocked(const struct lu_env *env,
-			const struct cl_page *cl_page)
-{
-	const struct cl_page_slice *slice;
-	int result;
-
-	slice = cl_page_slice_get(cl_page, 0);
-	PASSERT(env, cl_page, slice->cpl_ops->cpo_is_vmlocked);
-	/*
-	 * Call ->cpo_is_vmlocked() directly instead of going through
-	 * CL_PAGE_INVOKE(), because cl_page_is_vmlocked() is used by
-	 * cl_page_invariant().
- */ - result = slice->cpl_ops->cpo_is_vmlocked(env, slice); - PASSERT(env, cl_page, result == -EBUSY || result == -ENODATA); - - return result == -EBUSY; -} -EXPORT_SYMBOL(cl_page_is_vmlocked); - void cl_page_touch(const struct lu_env *env, const struct cl_page *cl_page, size_t to) { diff --git a/fs/lustre/osc/osc_lock.c b/fs/lustre/osc/osc_lock.c index eb3cb58..c8f8502 100644 --- a/fs/lustre/osc/osc_lock.c +++ b/fs/lustre/osc/osc_lock.c @@ -644,7 +644,7 @@ static bool weigh_cb(const struct lu_env *env, struct cl_io *io, struct osc_page *ops = pvec[i]; struct cl_page *page = ops->ops_cl.cpl_page; - if (cl_page_is_vmlocked(env, page) || + if (PageLocked(page->cp_vmpage) || PageDirty(page->cp_vmpage) || PageWriteback(page->cp_vmpage)) return false; From patchwork Thu Aug 4 01:37:50 2022 From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:37:50 -0400 Message-Id: <1659577097-19253-6-git-send-email-jsimmons@infradead.org> In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 05/32] lustre: clio: remove cpo_own and cpo_disown Cc: Lustre Development List From: "John L. Hammond" Remove the cpo_own and cpo_disown methods from struct cl_page_operations. These methods were only implemented by the vvp layer so they can be inlined into cl_page_own0() and cl_page_disown(). Move most of vvp_page_discard() and all of vvp_transient_page_discard() into cl_page_discard(). WC-bug-id: https://jira.whamcloud.com/browse/LU-10994 Lustre-commit: 81c6dc423ce4c62a6 ("LU-10994 clio: remove cpo_own and cpo_disown") Signed-off-by: John L.
Hammond Reviewed-on: https://review.whamcloud.com/47372 Reviewed-by: Patrick Farrell Reviewed-by: Bobi Jam Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 33 ++++---------- fs/lustre/llite/llite_lib.c | 2 +- fs/lustre/llite/rw.c | 49 +++++++++++---------- fs/lustre/llite/rw26.c | 6 +-- fs/lustre/llite/vvp_dev.c | 3 +- fs/lustre/llite/vvp_internal.h | 3 -- fs/lustre/llite/vvp_page.c | 86 ++++++------------------------------ fs/lustre/obdclass/cl_io.c | 12 +++-- fs/lustre/obdclass/cl_page.c | 99 +++++++++++++++++++++++------------------- 9 files changed, 112 insertions(+), 181 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index db5f610..4460ae1 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -764,6 +764,10 @@ struct cl_page { * creation. */ enum cl_page_type cp_type:CP_TYPE_BITS; + unsigned int cp_defer_uptodate:1, + cp_ra_updated:1, + cp_ra_used:1; + /* which slab kmem index this memory allocated from */ short int cp_kmem_index; @@ -822,7 +826,7 @@ enum cl_req_type { * * Methods taking an @io argument are for the activity happening in the * context of given @io. Page is assumed to be owned by that io, except for - * the obvious cases (like cl_page_operations::cpo_own()). + * the obvious cases. * * \see vvp_page_ops, lov_page_ops, osc_page_ops */ @@ -834,25 +838,6 @@ struct cl_page_operations { */ /** - * Called when @io acquires this page into the exclusive - * ownership. When this method returns, it is guaranteed that the is - * not owned by other io, and no transfer is going on against - * it. Optional. - * - * \see cl_page_own() - * \see vvp_page_own(), lov_page_own() - */ - int (*cpo_own)(const struct lu_env *env, - const struct cl_page_slice *slice, - struct cl_io *io, int nonblock); - /** Called when ownership it yielded. Optional. 
- * - * \see cl_page_disown() - * \see vvp_page_disown() - */ - void (*cpo_disown)(const struct lu_env *env, - const struct cl_page_slice *slice, struct cl_io *io); - /** * Called for a page that is already "owned" by @io from VM point of * view. Optional. * @@ -2290,8 +2275,7 @@ void cl_page_unassume(const struct lu_env *env, struct cl_io *io, struct cl_page *pg); void cl_page_disown(const struct lu_env *env, struct cl_io *io, struct cl_page *page); -void __cl_page_disown(const struct lu_env *env, - struct cl_io *io, struct cl_page *pg); +void __cl_page_disown(const struct lu_env *env, struct cl_page *pg); int cl_page_is_owned(const struct cl_page *pg, const struct cl_io *io); /** @} ownership */ @@ -2544,14 +2528,13 @@ void cl_page_list_splice(struct cl_page_list *list, void cl_page_list_del(const struct lu_env *env, struct cl_page_list *plist, struct cl_page *page); void cl_page_list_disown(const struct lu_env *env, - struct cl_io *io, struct cl_page_list *plist); + struct cl_page_list *plist); void cl_page_list_discard(const struct lu_env *env, struct cl_io *io, struct cl_page_list *plist); void cl_page_list_fini(const struct lu_env *env, struct cl_page_list *plist); void cl_2queue_init(struct cl_2queue *queue); -void cl_2queue_disown(const struct lu_env *env, struct cl_io *io, - struct cl_2queue *queue); +void cl_2queue_disown(const struct lu_env *env, struct cl_2queue *queue); void cl_2queue_discard(const struct lu_env *env, struct cl_io *io, struct cl_2queue *queue); void cl_2queue_fini(const struct lu_env *env, struct cl_2queue *queue); diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 6adbf10..b55a30f 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -1955,7 +1955,7 @@ int ll_io_zero_page(struct inode *inode, pgoff_t index, pgoff_t offset, queuefini2: cl_2queue_discard(env, io, queue); queuefini1: - cl_2queue_disown(env, io, queue); + cl_2queue_disown(env, queue); cl_2queue_fini(env, queue); } diff 
--git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c index 478ef1b..7c4b8ec 100644 --- a/fs/lustre/llite/rw.c +++ b/fs/lustre/llite/rw.c @@ -195,10 +195,9 @@ static int ll_read_ahead_page(const struct lu_env *env, struct cl_io *io, enum ra_stat which = _NR_RA_STAT; /* keep gcc happy */ struct cl_object *clob = io->ci_obj; struct inode *inode = vvp_object_inode(clob); - const char *msg = NULL; - struct cl_page *page; - struct vvp_page *vpg; struct page *vmpage = NULL; + const char *msg = NULL; + struct cl_page *cp; int rc = 0; switch (hint) { @@ -233,34 +232,35 @@ static int ll_read_ahead_page(const struct lu_env *env, struct cl_io *io, goto out; } - page = cl_page_find(env, clob, vmpage->index, vmpage, CPT_CACHEABLE); - if (IS_ERR(page)) { + cp = cl_page_find(env, clob, vmpage->index, vmpage, CPT_CACHEABLE); + if (IS_ERR(cp)) { which = RA_STAT_FAILED_GRAB_PAGE; msg = "cl_page_find failed"; - rc = PTR_ERR(page); + rc = PTR_ERR(cp); goto out; } - lu_ref_add(&page->cp_reference, "ra", current); - cl_page_assume(env, io, page); - vpg = cl2vvp_page(cl_object_page_slice(clob, page)); - if (!vpg->vpg_defer_uptodate && !PageUptodate(vmpage)) { + lu_ref_add(&cp->cp_reference, "ra", current); + cl_page_assume(env, io, cp); + + if (!cp->cp_defer_uptodate && !PageUptodate(vmpage)) { if (hint == MAYNEED) { - vpg->vpg_defer_uptodate = 1; - vpg->vpg_ra_used = 0; + cp->cp_defer_uptodate = 1; + cp->cp_ra_used = 0; } - cl_page_list_add(queue, page, true); + + cl_page_list_add(queue, cp, true); } else { /* skip completed pages */ - cl_page_unassume(env, io, page); + cl_page_unassume(env, io, cp); /* This page is already uptodate, returning a positive number * to tell the callers about this */ rc = 1; } - lu_ref_del(&page->cp_reference, "ra", current); - cl_page_put(env, page); + lu_ref_del(&cp->cp_reference, "ra", current); + cl_page_put(env, cp); out: if (vmpage) { if (rc) @@ -695,7 +695,7 @@ static void ll_readahead_handle_work(struct work_struct *wq) cl_page_list_discard(env, io, 
&queue->c2_qin); /* Unlock unsent read pages in case of error. */ - cl_page_list_disown(env, io, &queue->c2_qin); + cl_page_list_disown(env, &queue->c2_qin); cl_2queue_fini(env, queue); out_io_fini: @@ -1649,9 +1649,9 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io, unlockpage = false; vpg = cl2vvp_page(cl_object_page_slice(page->cp_obj, page)); - uptodate = vpg->vpg_defer_uptodate; + uptodate = page->cp_defer_uptodate; - if (ll_readahead_enabled(sbi) && !vpg->vpg_ra_updated && ras) { + if (ll_readahead_enabled(sbi) && !page->cp_ra_updated && ras) { enum ras_update_flags flags = 0; if (uptodate) @@ -1663,7 +1663,7 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io, cl_2queue_init(queue); if (uptodate) { - vpg->vpg_ra_used = 1; + page->cp_ra_used = 1; SetPageUptodate(page->cp_vmpage); cl_page_disown(env, io, page); } else { @@ -1740,7 +1740,7 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io, cl_page_list_discard(env, io, &queue->c2_qin); /* Unlock unsent read pages in case of error. 
*/ - cl_page_list_disown(env, io, &queue->c2_qin); + cl_page_list_disown(env, &queue->c2_qin); cl_2queue_fini(env, queue); @@ -1881,7 +1881,7 @@ int ll_readpage(struct file *file, struct page *vmpage) } vpg = cl2vvp_page(cl_object_page_slice(page->cp_obj, page)); - if (vpg->vpg_defer_uptodate) { + if (page->cp_defer_uptodate) { enum ras_update_flags flags = LL_RAS_HIT; if (lcc && lcc->lcc_type == LCC_MMAP) @@ -1894,7 +1894,7 @@ int ll_readpage(struct file *file, struct page *vmpage) */ ras_update(sbi, inode, ras, vvp_index(vpg), flags, io); /* avoid duplicate ras_update() call */ - vpg->vpg_ra_updated = 1; + page->cp_ra_updated = 1; if (ll_use_fast_io(file, ras, vvp_index(vpg))) result = 0; @@ -1907,11 +1907,12 @@ int ll_readpage(struct file *file, struct page *vmpage) /* export the page and skip io stack */ if (result == 0) { - vpg->vpg_ra_used = 1; + page->cp_ra_used = 1; SetPageUptodate(vmpage); } else { ll_ra_stats_inc_sbi(sbi, RA_STAT_FAILED_FAST_READ); } + /* release page refcount before unlocking the page to ensure * the object won't be destroyed in the calling path of * cl_page_put(). Please see comment in ll_releasepage(). 
diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c index 8b379ca..7147f0f 100644 --- a/fs/lustre/llite/rw26.c +++ b/fs/lustre/llite/rw26.c @@ -286,7 +286,7 @@ static unsigned long ll_iov_iter_alignment(struct iov_iter *i) } cl_2queue_discard(env, io, queue); - cl_2queue_disown(env, io, queue); + cl_2queue_disown(env, queue); cl_2queue_fini(env, queue); return rc; } @@ -468,8 +468,8 @@ static int ll_prepare_partial_page(const struct lu_env *env, struct cl_io *io, goto out; } - if (vpg->vpg_defer_uptodate) { - vpg->vpg_ra_used = 1; + if (pg->cp_defer_uptodate) { + pg->cp_ra_used = 1; result = 0; goto out; } diff --git a/fs/lustre/llite/vvp_dev.c b/fs/lustre/llite/vvp_dev.c index 0c417d8..99335bd 100644 --- a/fs/lustre/llite/vvp_dev.c +++ b/fs/lustre/llite/vvp_dev.c @@ -435,11 +435,10 @@ static void vvp_pgcache_page_show(const struct lu_env *env, vpg = cl2vvp_page(cl_page_at(page, &vvp_device_type)); vmpage = vpg->vpg_page; - seq_printf(seq, " %5i | %p %p %s %s %s | %p " DFID "(%p) %lu %u [", + seq_printf(seq, " %5i | %p %p %s %s | %p " DFID "(%p) %lu %u [", 0 /* gen */, vpg, page, "none", - vpg->vpg_defer_uptodate ? "du" : "- ", PageWriteback(vmpage) ? 
"wb" : "-", vmpage, PFID(ll_inode2fid(vmpage->mapping->host)), vmpage->mapping->host, vmpage->index, diff --git a/fs/lustre/llite/vvp_internal.h b/fs/lustre/llite/vvp_internal.h index b5e1df2..0e0da76 100644 --- a/fs/lustre/llite/vvp_internal.h +++ b/fs/lustre/llite/vvp_internal.h @@ -213,9 +213,6 @@ struct vvp_object { */ struct vvp_page { struct cl_page_slice vpg_cl; - unsigned int vpg_defer_uptodate:1, - vpg_ra_updated:1, - vpg_ra_used:1; /** VM page */ struct page *vpg_page; }; diff --git a/fs/lustre/llite/vvp_page.c b/fs/lustre/llite/vvp_page.c index 82ce5ab..8875a62 100644 --- a/fs/lustre/llite/vvp_page.c +++ b/fs/lustre/llite/vvp_page.c @@ -73,32 +73,6 @@ static void vvp_page_fini(const struct lu_env *env, } } -static int vvp_page_own(const struct lu_env *env, - const struct cl_page_slice *slice, struct cl_io *io, - int nonblock) -{ - struct vvp_page *vpg = cl2vvp_page(slice); - struct page *vmpage = vpg->vpg_page; - - LASSERT(vmpage); - if (nonblock) { - if (!trylock_page(vmpage)) - return -EAGAIN; - - if (unlikely(PageWriteback(vmpage))) { - unlock_page(vmpage); - return -EAGAIN; - } - - return 0; - } - - lock_page(vmpage); - wait_on_page_writeback(vmpage); - - return 0; -} - static void vvp_page_assume(const struct lu_env *env, const struct cl_page_slice *slice, struct cl_io *unused) @@ -120,31 +94,15 @@ static void vvp_page_unassume(const struct lu_env *env, LASSERT(PageLocked(vmpage)); } -static void vvp_page_disown(const struct lu_env *env, - const struct cl_page_slice *slice, struct cl_io *io) -{ - struct page *vmpage = cl2vm_page(slice); - - LASSERT(vmpage); - LASSERT(PageLocked(vmpage)); - - unlock_page(cl2vm_page(slice)); -} - static void vvp_page_discard(const struct lu_env *env, const struct cl_page_slice *slice, struct cl_io *unused) { - struct page *vmpage = cl2vm_page(slice); - struct vvp_page *vpg = cl2vvp_page(slice); + struct cl_page *cp = slice->cpl_page; + struct page *vmpage = cp->cp_vmpage; - LASSERT(vmpage); - 
LASSERT(PageLocked(vmpage)); - - if (vpg->vpg_defer_uptodate && !vpg->vpg_ra_used && vmpage->mapping) + if (cp->cp_defer_uptodate && !cp->cp_ra_used && vmpage->mapping) ll_ra_stats_inc(vmpage->mapping->host, RA_STAT_DISCARDED); - - generic_error_remove_page(vmpage->mapping, vmpage); } static void vvp_page_delete(const struct lu_env *env, @@ -227,22 +185,21 @@ static void vvp_page_completion_read(const struct lu_env *env, const struct cl_page_slice *slice, int ioret) { - struct vvp_page *vpg = cl2vvp_page(slice); - struct page *vmpage = vpg->vpg_page; - struct cl_page *page = slice->cpl_page; - struct inode *inode = vvp_object_inode(page->cp_obj); + struct cl_page *cp = slice->cpl_page; + struct page *vmpage = cp->cp_vmpage; + struct inode *inode = vvp_object_inode(cp->cp_obj); LASSERT(PageLocked(vmpage)); - CL_PAGE_HEADER(D_PAGE, env, page, "completing READ with %d\n", ioret); + CL_PAGE_HEADER(D_PAGE, env, cp, "completing READ with %d\n", ioret); - if (vpg->vpg_defer_uptodate) + if (cp->cp_defer_uptodate) ll_ra_count_put(ll_i2sbi(inode), 1); if (ioret == 0) { - if (!vpg->vpg_defer_uptodate) + if (!cp->cp_defer_uptodate) SetPageUptodate(vmpage); - } else if (vpg->vpg_defer_uptodate) { - vpg->vpg_defer_uptodate = 0; + } else if (cp->cp_defer_uptodate) { + cp->cp_defer_uptodate = 0; if (ioret == -EAGAIN) { /* mirror read failed, it needs to destroy the page * because subpage would be from wrong osc when trying @@ -252,7 +209,7 @@ static void vvp_page_completion_read(const struct lu_env *env, } } - if (!page->cp_sync_io) + if (!cp->cp_sync_io) unlock_page(vmpage); } @@ -329,8 +286,8 @@ static int vvp_page_print(const struct lu_env *env, struct vvp_page *vpg = cl2vvp_page(slice); struct page *vmpage = vpg->vpg_page; - (*printer)(env, cookie, LUSTRE_VVP_NAME "-page@%p(%d:%d) vm@%p ", - vpg, vpg->vpg_defer_uptodate, vpg->vpg_ra_used, vmpage); + (*printer)(env, cookie, + LUSTRE_VVP_NAME"-page@%p vm@%p ", vpg, vmpage); if (vmpage) { (*printer)(env, cookie, "%lx %d:%d %lx 
%lu %slru", (long)vmpage->flags, page_count(vmpage), @@ -356,10 +313,8 @@ static int vvp_page_fail(const struct lu_env *env, } static const struct cl_page_operations vvp_page_ops = { - .cpo_own = vvp_page_own, .cpo_assume = vvp_page_assume, .cpo_unassume = vvp_page_unassume, - .cpo_disown = vvp_page_disown, .cpo_discard = vvp_page_discard, .cpo_delete = vvp_page_delete, .cpo_fini = vvp_page_fini, @@ -378,20 +333,7 @@ static int vvp_page_fail(const struct lu_env *env, }, }; -static void vvp_transient_page_discard(const struct lu_env *env, - const struct cl_page_slice *slice, - struct cl_io *unused) -{ - struct cl_page *page = slice->cpl_page; - - /* - * For transient pages, remove it from the radix tree. - */ - cl_page_delete(env, page); -} - static const struct cl_page_operations vvp_transient_page_ops = { - .cpo_discard = vvp_transient_page_discard, .cpo_print = vvp_page_print, }; diff --git a/fs/lustre/obdclass/cl_io.c b/fs/lustre/obdclass/cl_io.c index c97ac0f..4246e17 100644 --- a/fs/lustre/obdclass/cl_io.c +++ b/fs/lustre/obdclass/cl_io.c @@ -911,8 +911,7 @@ void cl_page_list_splice(struct cl_page_list *src, struct cl_page_list *dst) /** * Disowns pages in a queue. */ -void cl_page_list_disown(const struct lu_env *env, - struct cl_io *io, struct cl_page_list *plist) +void cl_page_list_disown(const struct lu_env *env, struct cl_page_list *plist) { struct cl_page *page; struct cl_page *temp; @@ -930,7 +929,7 @@ void cl_page_list_disown(const struct lu_env *env, /* * XXX __cl_page_disown() will fail if page is not locked. */ - __cl_page_disown(env, io, page); + __cl_page_disown(env, page); lu_ref_del_at(&page->cp_reference, &page->cp_queue_ref, "queue", plist); cl_page_put(env, page); @@ -990,11 +989,10 @@ void cl_2queue_init(struct cl_2queue *queue) /** * Disown pages in both lists of a 2-queue. 
*/ -void cl_2queue_disown(const struct lu_env *env, - struct cl_io *io, struct cl_2queue *queue) +void cl_2queue_disown(const struct lu_env *env, struct cl_2queue *queue) { - cl_page_list_disown(env, io, &queue->c2_qin); - cl_page_list_disown(env, io, &queue->c2_qout); + cl_page_list_disown(env, &queue->c2_qin); + cl_page_list_disown(env, &queue->c2_qout); } EXPORT_SYMBOL(cl_2queue_disown); diff --git a/fs/lustre/obdclass/cl_page.c b/fs/lustre/obdclass/cl_page.c index b5b5448..cff2c54 100644 --- a/fs/lustre/obdclass/cl_page.c +++ b/fs/lustre/obdclass/cl_page.c @@ -40,6 +40,7 @@ #include #include #include +#include #include #include "cl_internal.h" @@ -487,26 +488,22 @@ static void cl_page_owner_set(struct cl_page *page) page->cp_owner->ci_owned_nr++; } -void __cl_page_disown(const struct lu_env *env, - struct cl_io *io, struct cl_page *cl_page) +void __cl_page_disown(const struct lu_env *env, struct cl_page *cp) { - const struct cl_page_slice *slice; enum cl_page_state state; - int i; + struct page *vmpage; - state = cl_page->cp_state; - cl_page_owner_clear(cl_page); + state = cp->cp_state; + cl_page_owner_clear(cp); if (state == CPS_OWNED) - cl_page_state_set(env, cl_page, CPS_CACHED); - /* - * Completion call-backs are executed in the bottom-up order, so that - * uppermost layer (llite), responsible for VFS/VM interaction runs - * last and can release locks safely. - */ - cl_page_slice_for_each_reverse(cl_page, slice, i) { - if (slice->cpl_ops->cpo_disown) - (*slice->cpl_ops->cpo_disown)(env, slice, io); + cl_page_state_set(env, cp, CPS_CACHED); + + if (cp->cp_type == CPT_CACHEABLE) { + vmpage = cp->cp_vmpage; + LASSERT(vmpage); + LASSERT(PageLocked(vmpage)); + unlock_page(vmpage); } } @@ -539,45 +536,51 @@ int cl_page_is_owned(const struct cl_page *pg, const struct cl_io *io) * another thread, or in IO. 
* * \see cl_page_disown() - * \see cl_page_operations::cpo_own() * \see cl_page_own_try() * \see cl_page_own */ static int __cl_page_own(const struct lu_env *env, struct cl_io *io, struct cl_page *cl_page, int nonblock) { - const struct cl_page_slice *slice; + struct page *vmpage = cl_page->cp_vmpage; int result = 0; - int i; - - io = cl_io_top(io); if (cl_page->cp_state == CPS_FREEING) { result = -ENOENT; goto out; } - cl_page_slice_for_each(cl_page, slice, i) { - if (slice->cpl_ops->cpo_own) - result = (*slice->cpl_ops->cpo_own)(env, slice, - io, nonblock); - if (result != 0) - break; - } - if (result > 0) - result = 0; + LASSERT(vmpage); - if (result == 0) { - PASSERT(env, cl_page, !cl_page->cp_owner); - cl_page->cp_owner = cl_io_top(io); - cl_page_owner_set(cl_page); - if (cl_page->cp_state != CPS_FREEING) { - cl_page_state_set(env, cl_page, CPS_OWNED); - } else { - __cl_page_disown(env, io, cl_page); - result = -ENOENT; + if (cl_page->cp_type == CPT_TRANSIENT) { + /* OK */ + } else if (nonblock) { + if (!trylock_page(vmpage)) { + result = -EAGAIN; + goto out; } + + if (unlikely(PageWriteback(vmpage))) { + unlock_page(vmpage); + result = -EAGAIN; + goto out; + } + } else { + lock_page(vmpage); + wait_on_page_writeback(vmpage); } + + PASSERT(env, cl_page, !cl_page->cp_owner); + cl_page->cp_owner = cl_io_top(io); + cl_page_owner_set(cl_page); + + if (cl_page->cp_state == CPS_FREEING) { + __cl_page_disown(env, cl_page); + result = -ENOENT; + goto out; + } + + cl_page_state_set(env, cl_page, CPS_OWNED); out: return result; } @@ -672,13 +675,11 @@ void cl_page_unassume(const struct lu_env *env, * \post !cl_page_is_owned(pg, io) * * \see cl_page_own() - * \see cl_page_operations::cpo_disown() */ void cl_page_disown(const struct lu_env *env, struct cl_io *io, struct cl_page *pg) { - io = cl_io_top(io); - __cl_page_disown(env, io, pg); + __cl_page_disown(env, pg); } EXPORT_SYMBOL(cl_page_disown); @@ -693,15 +694,25 @@ void cl_page_disown(const struct lu_env *env, * 
\see cl_page_operations::cpo_discard() */ void cl_page_discard(const struct lu_env *env, - struct cl_io *io, struct cl_page *cl_page) + struct cl_io *io, struct cl_page *cp) { const struct cl_page_slice *slice; + struct page *vmpage; int i; - cl_page_slice_for_each(cl_page, slice, i) { + cl_page_slice_for_each(cp, slice, i) { if (slice->cpl_ops->cpo_discard) (*slice->cpl_ops->cpo_discard)(env, slice, io); } + + if (cp->cp_type == CPT_CACHEABLE) { + vmpage = cp->cp_vmpage; + LASSERT(vmpage); + LASSERT(PageLocked(vmpage)); + generic_error_remove_page(vmpage->mapping, vmpage); + } else { + cl_page_delete(env, cp); + } } EXPORT_SYMBOL(cl_page_discard); @@ -813,7 +824,7 @@ int cl_page_prep(const struct lu_env *env, struct cl_io *io, if (cl_page->cp_type != CPT_TRANSIENT) { cl_page_slice_for_each(cl_page, slice, i) { - if (slice->cpl_ops->cpo_own) + if (slice->cpl_ops->io[crt].cpo_prep) result = (*slice->cpl_ops->io[crt].cpo_prep)(env, slice, io); From patchwork Thu Aug 4 01:37:51 2022 From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:37:51 -0400 Message-Id: <1659577097-19253-7-git-send-email-jsimmons@infradead.org> In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 06/32] lustre: clio: remove cpo_assume, cpo_unassume, cpo_fini Cc: Lustre Development List From: "John L. Hammond" Remove the cl_page methods cpo_assume, cpo_unassume, and cpo_fini. These methods were only implemented by the vvp layer and so they can be easily inlined into cl_page_assume() and cl_page_unassume(). Remove vvp_page_delete() by inlining its contents to cl_page_delete0(). WC-bug-id: https://jira.whamcloud.com/browse/LU-10994 Lustre-commit: 9045894fe0f503333 ("LU-10994 clio: remove cpo_assume, cpo_unassume, cpo_fini") Signed-off-by: John L.
Hammond Reviewed-on: https://review.whamcloud.com/47373 Reviewed-by: Patrick Farrell Reviewed-by: Bobi Jam Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 23 -------- fs/lustre/llite/vvp_page.c | 71 +------------------------ fs/lustre/obdclass/cl_page.c | 119 +++++++++++++++++++++++++----------------- 3 files changed, 73 insertions(+), 140 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index 4460ae1..c66e98c5 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -838,25 +838,6 @@ struct cl_page_operations { */ /** - * Called for a page that is already "owned" by @io from VM point of - * view. Optional. - * - * \see cl_page_assume() - * \see vvp_page_assume(), lov_page_assume() - */ - void (*cpo_assume)(const struct lu_env *env, - const struct cl_page_slice *slice, struct cl_io *io); - /** Dual to cl_page_operations::cpo_assume(). Optional. Called - * bottom-to-top when IO releases a page without actually unlocking - * it. - * - * \see cl_page_unassume() - * \see vvp_page_unassume() - */ - void (*cpo_unassume)(const struct lu_env *env, - const struct cl_page_slice *slice, - struct cl_io *io); - /** * Update file attributes when all we have is this page. Used for tiny * writes to update attributes when we don't have a full cl_io. */ @@ -884,10 +865,6 @@ struct cl_page_operations { */ void (*cpo_delete)(const struct lu_env *env, const struct cl_page_slice *slice); - /** Destructor. Frees resources and slice itself. */ - void (*cpo_fini)(const struct lu_env *env, - struct cl_page_slice *slice, - struct pagevec *pvec); /** * Optional debugging helper. Prints given page slice. * diff --git a/fs/lustre/llite/vvp_page.c b/fs/lustre/llite/vvp_page.c index 8875a62..db1cd7c1 100644 --- a/fs/lustre/llite/vvp_page.c +++ b/fs/lustre/llite/vvp_page.c @@ -52,48 +52,6 @@ * Page operations. 
* */ -static void vvp_page_fini(const struct lu_env *env, - struct cl_page_slice *slice, - struct pagevec *pvec) -{ - struct vvp_page *vpg = cl2vvp_page(slice); - struct page *vmpage = vpg->vpg_page; - - /* - * vmpage->private was already cleared when page was moved into - * VPG_FREEING state. - */ - LASSERT((struct cl_page *)vmpage->private != slice->cpl_page); - LASSERT(vmpage); - if (pvec) { - if (!pagevec_add(pvec, vmpage)) - pagevec_release(pvec); - } else { - put_page(vmpage); - } -} - -static void vvp_page_assume(const struct lu_env *env, - const struct cl_page_slice *slice, - struct cl_io *unused) -{ - struct page *vmpage = cl2vm_page(slice); - - LASSERT(vmpage); - LASSERT(PageLocked(vmpage)); - wait_on_page_writeback(vmpage); -} - -static void vvp_page_unassume(const struct lu_env *env, - const struct cl_page_slice *slice, - struct cl_io *unused) -{ - struct page *vmpage = cl2vm_page(slice); - - LASSERT(vmpage); - LASSERT(PageLocked(vmpage)); -} - static void vvp_page_discard(const struct lu_env *env, const struct cl_page_slice *slice, struct cl_io *unused) @@ -105,29 +63,6 @@ static void vvp_page_discard(const struct lu_env *env, ll_ra_stats_inc(vmpage->mapping->host, RA_STAT_DISCARDED); } -static void vvp_page_delete(const struct lu_env *env, - const struct cl_page_slice *slice) -{ - struct page *vmpage = cl2vm_page(slice); - struct cl_page *page = slice->cpl_page; - - LASSERT(PageLocked(vmpage)); - LASSERT((struct cl_page *)vmpage->private == page); - - /* Drop the reference count held in vvp_page_init */ - if (refcount_dec_and_test(&page->cp_ref)) { - /* It mustn't reach zero here! */ - LASSERTF(0, "page = %p, refc reached zero\n", page); - } - - ClearPagePrivate(vmpage); - vmpage->private = 0; - /* - * Reference from vmpage to cl_page is removed, but the reference back - * is still here. It is removed later in vvp_page_fini(). 
- */ -} - static int vvp_page_prep_read(const struct lu_env *env, const struct cl_page_slice *slice, struct cl_io *unused) @@ -313,11 +248,7 @@ static int vvp_page_fail(const struct lu_env *env, } static const struct cl_page_operations vvp_page_ops = { - .cpo_assume = vvp_page_assume, - .cpo_unassume = vvp_page_unassume, .cpo_discard = vvp_page_discard, - .cpo_delete = vvp_page_delete, - .cpo_fini = vvp_page_fini, .cpo_print = vvp_page_print, .io = { [CRT_READ] = { @@ -355,7 +286,7 @@ int vvp_page_init(const struct lu_env *env, struct cl_object *obj, &vvp_transient_page_ops); } else { get_page(vmpage); - /* in cache, decref in vvp_page_delete */ + /* in cache, decref in cl_page_delete */ refcount_inc(&page->cp_ref); SetPagePrivate(vmpage); vmpage->private = (unsigned long)page; diff --git a/fs/lustre/obdclass/cl_page.c b/fs/lustre/obdclass/cl_page.c index cff2c54..6319c3d 100644 --- a/fs/lustre/obdclass/cl_page.c +++ b/fs/lustre/obdclass/cl_page.c @@ -129,29 +129,39 @@ static void __cl_page_free(struct cl_page *cl_page, unsigned short bufsize) } } -static void cl_page_free(const struct lu_env *env, struct cl_page *cl_page, +static void cl_page_free(const struct lu_env *env, struct cl_page *cp, struct pagevec *pvec) { - struct cl_object *obj = cl_page->cp_obj; + struct cl_object *obj = cp->cp_obj; unsigned short bufsize = cl_object_header(obj)->coh_page_bufsize; - struct cl_page_slice *slice; - int i; + struct page *vmpage; - PASSERT(env, cl_page, list_empty(&cl_page->cp_batch)); - PASSERT(env, cl_page, !cl_page->cp_owner); - PASSERT(env, cl_page, cl_page->cp_state == CPS_FREEING); + PASSERT(env, cp, list_empty(&cp->cp_batch)); + PASSERT(env, cp, !cp->cp_owner); + PASSERT(env, cp, cp->cp_state == CPS_FREEING); - cl_page_slice_for_each(cl_page, slice, i) { - if (unlikely(slice->cpl_ops->cpo_fini)) - slice->cpl_ops->cpo_fini(env, slice, pvec); + if (cp->cp_type == CPT_CACHEABLE) { + /* vmpage->private was already cleared when page was + * moved into CPS_FREEING state. 
+ */ + vmpage = cp->cp_vmpage; + LASSERT(vmpage); + LASSERT((struct cl_page *)vmpage->private != cp); + + if (pvec) { + if (!pagevec_add(pvec, vmpage)) + pagevec_release(pvec); + } else { + put_page(vmpage); + } } - cl_page->cp_layer_count = 0; - lu_object_ref_del_at(&obj->co_lu, &cl_page->cp_obj_ref, - "cl_page", cl_page); - if (cl_page->cp_type != CPT_TRANSIENT) + + cp->cp_layer_count = 0; + lu_object_ref_del_at(&obj->co_lu, &cp->cp_obj_ref, "cl_page", cp); + if (cp->cp_type != CPT_TRANSIENT) cl_object_put(env, obj); - lu_ref_fini(&cl_page->cp_reference); - __cl_page_free(cl_page, bufsize); + lu_ref_fini(&cp->cp_reference); + __cl_page_free(cp, bufsize); } static struct cl_page *__cl_page_alloc(struct cl_object *o) @@ -613,28 +623,27 @@ int cl_page_own_try(const struct lu_env *env, struct cl_io *io, * * Called when page is already locked by the hosting VM. * - * \pre !cl_page_is_owned(cl_page, io) - * \post cl_page_is_owned(cl_page, io) + * \pre !cl_page_is_owned(cp, io) + * \post cl_page_is_owned(cp, io) * * \see cl_page_operations::cpo_assume() */ void cl_page_assume(const struct lu_env *env, - struct cl_io *io, struct cl_page *cl_page) + struct cl_io *io, struct cl_page *cp) { - const struct cl_page_slice *slice; - int i; - - io = cl_io_top(io); + struct page *vmpage; - cl_page_slice_for_each(cl_page, slice, i) { - if (slice->cpl_ops->cpo_assume) - (*slice->cpl_ops->cpo_assume)(env, slice, io); + if (cp->cp_type == CPT_CACHEABLE) { + vmpage = cp->cp_vmpage; + LASSERT(vmpage); + LASSERT(PageLocked(vmpage)); + wait_on_page_writeback(vmpage); } - PASSERT(env, cl_page, !cl_page->cp_owner); - cl_page->cp_owner = cl_io_top(io); - cl_page_owner_set(cl_page); - cl_page_state_set(env, cl_page, CPS_OWNED); + PASSERT(env, cp, !cp->cp_owner); + cp->cp_owner = cl_io_top(io); + cl_page_owner_set(cp); + cl_page_state_set(env, cp, CPS_OWNED); } EXPORT_SYMBOL(cl_page_assume); @@ -644,24 +653,23 @@ void cl_page_assume(const struct lu_env *env, * Moves cl_page into 
cl_page_state::CPS_CACHED without releasing a lock * on the underlying VM page (as VM is supposed to do this itself). * - * \pre cl_page_is_owned(cl_page, io) - * \post !cl_page_is_owned(cl_page, io) + * \pre cl_page_is_owned(cp, io) + * \post !cl_page_is_owned(cp, io) * * \see cl_page_assume() */ void cl_page_unassume(const struct lu_env *env, - struct cl_io *io, struct cl_page *cl_page) + struct cl_io *io, struct cl_page *cp) { - const struct cl_page_slice *slice; - int i; + struct page *vmpage; - io = cl_io_top(io); - cl_page_owner_clear(cl_page); - cl_page_state_set(env, cl_page, CPS_CACHED); + cl_page_owner_clear(cp); + cl_page_state_set(env, cp, CPS_CACHED); - cl_page_slice_for_each_reverse(cl_page, slice, i) { - if (slice->cpl_ops->cpo_unassume) - (*slice->cpl_ops->cpo_unassume)(env, slice, io); + if (cp->cp_type == CPT_CACHEABLE) { + vmpage = cp->cp_vmpage; + LASSERT(vmpage); + LASSERT(PageLocked(vmpage)); } } EXPORT_SYMBOL(cl_page_unassume); @@ -721,24 +729,41 @@ void cl_page_discard(const struct lu_env *env, * cl_pages, e.g,. in a error handling cl_page_find()->__cl_page_delete() * path. Doesn't check page invariant. */ -static void __cl_page_delete(const struct lu_env *env, - struct cl_page *cl_page) +static void __cl_page_delete(const struct lu_env *env, struct cl_page *cp) { const struct cl_page_slice *slice; + struct page *vmpage; int i; - PASSERT(env, cl_page, cl_page->cp_state != CPS_FREEING); + PASSERT(env, cp, cp->cp_state != CPS_FREEING); /* * Sever all ways to obtain new pointers to @cl_page. 
*/ - cl_page_owner_clear(cl_page); - __cl_page_state_set(env, cl_page, CPS_FREEING); + cl_page_owner_clear(cp); + __cl_page_state_set(env, cp, CPS_FREEING); - cl_page_slice_for_each_reverse(cl_page, slice, i) { + cl_page_slice_for_each_reverse(cp, slice, i) { if (slice->cpl_ops->cpo_delete) (*slice->cpl_ops->cpo_delete)(env, slice); } + + if (cp->cp_type == CPT_CACHEABLE) { + vmpage = cp->cp_vmpage; + LASSERT(PageLocked(vmpage)); + LASSERT((struct cl_page *)vmpage->private == cp); + + /* Drop the reference count held in vvp_page_init */ + refcount_dec(&cp->cp_ref); + ClearPagePrivate(vmpage); + vmpage->private = 0; + + /* + * The reference from vmpage to cl_page is removed, + * but the reference back is still here. It is removed + * later in cl_page_free(). + */ + } } /** From patchwork Thu Aug 4 01:37:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12935976 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4409BC19F2D for ; Thu, 4 Aug 2022 01:38:41 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Lyrwc3ZRBz23JT; Wed, 3 Aug 2022 18:38:40 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4LyrwR2X70z23J4 for ; Wed, 3 Aug 2022 18:38:31 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with 
ESMTP id B95EA100AFF8; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id B0EDC82CCE; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:37:52 -0400 Message-Id: <1659577097-19253-8-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 07/32] lustre: enc: enc-unaware clients get ENOKEY if file not found X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Sebastien Buisson To reduce issues with applications running on clients without keys or without fscrypt support that check for the existence of a file in an encrypted directory, return -ENOKEY instead of -ENOENT. For encryption-unaware clients, this is done on server side in the mdt layer, by checking if clients have the OBD_CONNECT2_ENCRYPT connection flag. For clients without the key, this is done in llite when the searched filename is not in encoded form. WC-bug-id: https://jira.whamcloud.com/browse/LU-15855 Lustre-commit: 00898697f998c095e ("LU-15855 enc: enc-unaware clients get ENOKEY if file not found") Signed-off-by: Sebastien Buisson Reviewed-on: https://review.whamcloud.com/47349 Reviewed-by: Andreas Dilger Reviewed-by: John L. 
Hammond Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/crypto.c | 35 ++++++++++++++++++++--------------- 1 file changed, 20 insertions(+), 15 deletions(-) diff --git a/fs/lustre/llite/crypto.c b/fs/lustre/llite/crypto.c index f075b9a..ad045c3 100644 --- a/fs/lustre/llite/crypto.c +++ b/fs/lustre/llite/crypto.c @@ -233,21 +233,26 @@ int ll_setup_filename(struct inode *dir, const struct qstr *iname, fid->f_ver = 0; } rc = fscrypt_setup_filename(dir, &dname, lookup, fname); - if (rc == -ENOENT && lookup && - ((is_root_inode(dir) && iname->len == strlen(dot_fscrypt_name) && - strncmp(iname->name, dot_fscrypt_name, iname->len) == 0) || - (!fscrypt_has_encryption_key(dir) && - unlikely(filename_is_volatile(iname->name, iname->len, NULL))))) { - /* In case of subdir mount of an encrypted directory, we allow - * lookup of /.fscrypt directory. - */ - /* For purpose of migration or mirroring without enc key, we - * allow lookup of volatile file without enc context. - */ - memset(fname, 0, sizeof(struct fscrypt_name)); - fname->disk_name.name = (unsigned char *)iname->name; - fname->disk_name.len = iname->len; - rc = 0; + if (rc == -ENOENT && lookup) { + if (((is_root_inode(dir) && + iname->len == strlen(dot_fscrypt_name) && + strncmp(iname->name, dot_fscrypt_name, iname->len) == 0) || + (!fscrypt_has_encryption_key(dir) && + unlikely(filename_is_volatile(iname->name, + iname->len, NULL))))) { + /* In case of subdir mount of an encrypted directory, + * we allow lookup of /.fscrypt directory. + */ + /* For purpose of migration or mirroring without enc key, + * we allow lookup of volatile file without enc context. 
+ */ + memset(fname, 0, sizeof(struct fscrypt_name)); + fname->disk_name.name = (unsigned char *)iname->name; + fname->disk_name.len = iname->len; + rc = 0; + } else if (!fscrypt_has_encryption_key(dir)) { + rc = -ENOKEY; + } } if (rc) return rc; From patchwork Thu Aug 4 01:37:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12935980 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6C2E9C19F29 for ; Thu, 4 Aug 2022 01:38:50 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Lyrwp0VSgz23Ht; Wed, 3 Aug 2022 18:38:50 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4LyrwQ4YdSz23Hx for ; Wed, 3 Aug 2022 18:38:30 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id B8FA1100AFF7; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id B3E4C94BEB; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:37:53 -0400 Message-Id: <1659577097-19253-9-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: 
[lustre-devel] [PATCH 08/32] lnet: socklnd: Duplicate ksock_conn_cb
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Chris Horn , Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Chris Horn

If two threads enter ksocknal_add_peer(), the first one to acquire the
ksnd_global_lock will create a ksock_peer_ni and associate a
ksock_conn_cb with it. When the second thread acquires the
ksnd_global_lock it will find the existing ksock_peer_ni, but it does
not check for an existing ksock_conn_cb. As a result, it overwrites the
existing ksock_conn_cb (ksock_peer_ni::ksnp_conn_cb) and the
ksock_conn_cb from the first thread becomes stranded.

Modify ksocknal_add_peer() to check whether the peer_ni has an existing
ksock_conn_cb associated with it

Fixes: 3ffceb7502 ("lnet: socklnd: replace route construct")
HPE-bug-id: LUS-10956
WC-bug-id: https://jira.whamcloud.com/browse/LU-15860
Lustre-commit: 0c91d49a44e1214b5 ("LU-15860 socklnd: Duplicate ksock_conn_cb")
Signed-off-by: Chris Horn
Reviewed-on: https://review.whamcloud.com/47361
Reviewed-by: Frank Sehr
Reviewed-by: Andriy Skulysh
Reviewed-by: Serguei Smirnov
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 net/lnet/klnds/socklnd/socklnd.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c
index 01b434f..2b08501 100644
--- a/net/lnet/klnds/socklnd/socklnd.c
+++ b/net/lnet/klnds/socklnd/socklnd.c
@@ -645,14 +645,17 @@ struct ksock_peer_ni *
 			     nidhash(&id->nid));
 	}
 
-	ksocknal_add_conn_cb_locked(peer_ni, conn_cb);
-
-	/* Remember conns_per_peer setting at the time
-	 * of connection initiation. It will define the
-	 * max number of conns per type for this conn_cb
-	 * while it's in use.
-	 */
-	conn_cb->ksnr_max_conns = ksocknal_get_conns_per_peer(peer_ni);
+	if (peer_ni->ksnp_conn_cb) {
+		ksocknal_conn_cb_decref(conn_cb);
+	} else {
+		ksocknal_add_conn_cb_locked(peer_ni, conn_cb);
+		/* Remember conns_per_peer setting at the time
+		 * of connection initiation. It will define the
+		 * max number of conns per type for this conn_cb
+		 * while it's in use.
+		 */
+		conn_cb->ksnr_max_conns = ksocknal_get_conns_per_peer(peer_ni);
+	}
 
 	write_unlock_bh(&ksocknal_data.ksnd_global_lock);

From patchwork Thu Aug 4 01:37:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 12935981
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher
	ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id B70F4C19F29 for ;
	Thu, 4 Aug 2022 01:38:53 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id
	4Lyrws2289z23Jk; Wed, 3 Aug 2022 18:38:53 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client
	certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix)
	with ESMTPS id 4LyrwS0TqYz1y7l for ; Wed, 3 Aug 2022 18:38:32 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id BAE61100AFF9;
	Wed, 3 Aug 2022 21:38:23 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id B710B8BBFC; Wed, 3 Aug 2022 21:38:23 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Wed, 3 Aug 2022 21:37:54 -0400
Message-Id:
<1659577097-19253-10-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 09/32] lustre: llite: enforce ROOT default on subdir mount X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lai Siyao , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Lai Siyao In subdirectory mount, the filesystem-wide default LMV doesn't take effect. This fix includes the following changes: * enforce the filesystem-wide default LMV on subdirectory mount if it's not set separately. * "lfs getdirstripe -D " should print the filesystem-wide default LMV. WC-bug-id: https://jira.whamcloud.com/browse/LU-15910 Lustre-commit: a162e24d2da5e4bd6 ("LU-15910 llite: enforce ROOT default on subdir mount") Signed-off-by: Lai Siyao Reviewed-on: https://review.whamcloud.com/47518 Reviewed-by: Andreas Dilger Reviewed-by: Jian Yu Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/dir.c | 75 ++++++++++++++++++++++++---------------- fs/lustre/llite/llite_internal.h | 3 ++ fs/lustre/llite/llite_lib.c | 46 ++++++++++++++++++++++-- 3 files changed, 93 insertions(+), 31 deletions(-) diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 6eaac9a..2b63c48 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -655,10 +655,9 @@ int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump, return rc; } -static int ll_dir_get_default_layout(struct inode *inode, void **plmm, - int *plmm_size, - struct ptlrpc_request **request, u64 valid, - enum get_default_layout_type type) +int ll_dir_get_default_layout(struct inode *inode, void **plmm, int 
*plmm_size, + struct ptlrpc_request **request, u64 valid, + enum get_default_layout_type type) { struct ll_sb_info *sbi = ll_i2sbi(inode); struct mdt_body *body; @@ -1627,35 +1626,53 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) lum = (struct lmv_user_md *)lmm; lli = ll_i2info(inode); - if (lum->lum_max_inherit == LMV_INHERIT_NONE || - (lum->lum_max_inherit > 0 && - lum->lum_max_inherit < lli->lli_dir_depth)) { - rc = -ENODATA; - goto finish_req; - } + if (lum->lum_max_inherit != + LMV_INHERIT_UNLIMITED) { + if (lum->lum_max_inherit == + LMV_INHERIT_NONE || + lum->lum_max_inherit < + LMV_INHERIT_END || + lum->lum_max_inherit > + LMV_INHERIT_MAX || + lum->lum_max_inherit < + lli->lli_dir_depth) { + rc = -ENODATA; + goto finish_req; + } + + if (lum->lum_max_inherit == + lli->lli_dir_depth) { + lum->lum_max_inherit = + LMV_INHERIT_NONE; + lum->lum_max_inherit_rr = + LMV_INHERIT_RR_NONE; + goto out_copy; + } - if (lum->lum_max_inherit == - lli->lli_dir_depth) { - lum->lum_max_inherit = LMV_INHERIT_NONE; - lum->lum_max_inherit_rr = - LMV_INHERIT_RR_NONE; - goto out_copy; - } - if (lum->lum_max_inherit > lli->lli_dir_depth && - lum->lum_max_inherit <= LMV_INHERIT_MAX) lum->lum_max_inherit -= lli->lli_dir_depth; + } - if (lum->lum_max_inherit_rr > - lli->lli_dir_depth && - lum->lum_max_inherit_rr <= - LMV_INHERIT_RR_MAX) - lum->lum_max_inherit_rr -= - lli->lli_dir_depth; - else if (lum->lum_max_inherit_rr == - lli->lli_dir_depth) - lum->lum_max_inherit_rr = - LMV_INHERIT_RR_NONE; + if (lum->lum_max_inherit_rr != + LMV_INHERIT_RR_UNLIMITED) { + if (lum->lum_max_inherit_rr == + LMV_INHERIT_NONE || + lum->lum_max_inherit_rr < + LMV_INHERIT_RR_END || + lum->lum_max_inherit_rr > + LMV_INHERIT_RR_MAX || + lum->lum_max_inherit_rr <= + lli->lli_dir_depth) { + lum->lum_max_inherit_rr = + LMV_INHERIT_RR_NONE; + goto out_copy; + } + + if (lum->lum_max_inherit_rr > + lli->lli_dir_depth) + lum->lum_max_inherit_rr -= + lli->lli_dir_depth; + } } 
out_copy: if (copy_to_user(ulmv, lmm, lmmsize)) diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 70a42d4..c350440 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -1155,6 +1155,9 @@ int ll_lov_getstripe_ea_info(struct inode *inode, const char *filename, struct ptlrpc_request **request); int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump, int set_default); +int ll_dir_get_default_layout(struct inode *inode, void **plmm, int *plmm_size, + struct ptlrpc_request **request, u64 valid, + enum get_default_layout_type type); int ll_dir_getstripe_default(struct inode *inode, void **lmmp, int *lmm_size, struct ptlrpc_request **request, struct ptlrpc_request **root_request, u64 valid); diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index b55a30f..5b80722 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -2972,6 +2972,39 @@ void ll_open_cleanup(struct super_block *sb, struct req_capsule *pill) ll_finish_md_op_data(op_data); } +/* set filesystem-wide default LMV for subdir mount if it's enabled on ROOT. 
*/ +static int ll_fileset_default_lmv_fixup(struct inode *inode, + struct lustre_md *md) +{ + struct ll_sb_info *sbi = ll_i2sbi(inode); + struct ptlrpc_request *req = NULL; + union lmv_mds_md *lmm = NULL; + int size = 0; + int rc; + + LASSERT(is_root_inode(inode)); + LASSERT(!fid_is_root(&sbi->ll_root_fid)); + LASSERT(!md->default_lmv); + + rc = ll_dir_get_default_layout(inode, (void **)&lmm, &size, &req, + OBD_MD_DEFAULT_MEA, + GET_DEFAULT_LAYOUT_ROOT); + if (rc && rc != -ENODATA) + goto out; + + rc = 0; + if (lmm && size) { + rc = md_unpackmd(sbi->ll_md_exp, &md->default_lmv, lmm, size); + if (rc < 0) + goto out; + rc = 0; + } +out: + if (req) + ptlrpc_req_finished(req); + return rc; +} + int ll_prep_inode(struct inode **inode, struct req_capsule *pill, struct super_block *sb, struct lookup_intent *it) { @@ -2993,8 +3026,17 @@ int ll_prep_inode(struct inode **inode, struct req_capsule *pill, * ll_update_lsm_md() may change md. */ if (it && (it->it_op & (IT_LOOKUP | IT_GETATTR)) && - S_ISDIR(md.body->mbo_mode) && !md.default_lmv) - default_lmv_deleted = true; + S_ISDIR(md.body->mbo_mode) && !md.default_lmv) { + if (unlikely(*inode && is_root_inode(*inode) && + !fid_is_root(&sbi->ll_root_fid))) { + rc = ll_fileset_default_lmv_fixup(*inode, &md); + if (rc) + goto out; + } + + if (!md.default_lmv) + default_lmv_deleted = true; + } if (*inode) { rc = ll_update_inode(*inode, &md); From patchwork Thu Aug 4 01:37:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12935978 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CDEC4C19F2A for ; Thu, 4 Aug 2022 
01:38:44 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Lyrwh3ChVz23JL; Wed, 3 Aug 2022 18:38:44 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4LyrwS6XYLz23JB for ; Wed, 3 Aug 2022 18:38:32 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id BD269100AFFA; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id BA6388D620; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:37:55 -0400 Message-Id: <1659577097-19253-11-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 10/32] lnet: Replace msg_rdma_force with a new md_flag LNET_MD_FLAG_GPU. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Alexey Lyashkov , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Alexey Lyashkov HPE-bug-id: LUS-10520 WC-bug-id: https://jira.whamcloud.com/browse/LU-15189 Lustre-commit: 959304eac7ec5b156 ("LU-15189 lnet: fix memory mapping.") Signed-off-by: Alexey Lyashkov Reviewed-on: https://review.whamcloud.com/45482 Reviewed-by: Andreas Dilger Reviewed-by: Alexander Boyko Reviewed-by: Oleg Drokin HPE-bug-id: LUS-10997 WC-bug-id: https://jira.whamcloud.com/browse/LU-15914 Lustre-commit: cb0220db3ce517b0e ("LU-15914 lnet: Fix null md deref for finalized message") Signed-off-by: Chris Horn Reviewed-by: Serguei Smirnov Reviewed-by: Alexey Lyashkov Reviewed-by: James Simmons Signed-off-by: James Simmons --- fs/lustre/include/lustre_net.h | 4 +++- fs/lustre/osc/osc_request.c | 3 +++ fs/lustre/ptlrpc/pers.c | 3 +++ include/linux/lnet/lib-types.h | 3 +-- include/uapi/linux/lnet/lnet-types.h | 2 ++ net/lnet/klnds/o2iblnd/o2iblnd.h | 23 +++++++++++++++-------- net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 31 +++++++++++++++++++++---------- net/lnet/lnet/lib-md.c | 3 +++ net/lnet/lnet/lib-move.c | 10 ++++++---- 9 files changed, 57 insertions(+), 25 deletions(-) diff --git a/fs/lustre/include/lustre_net.h b/fs/lustre/include/lustre_net.h index 7d29542..f70cc7c 100644 --- a/fs/lustre/include/lustre_net.h +++ b/fs/lustre/include/lustre_net.h @@ -1186,7 +1186,9 @@ struct ptlrpc_bulk_desc { /** completed with failure */ unsigned long bd_failure:1; /** client side */ - unsigned long bd_registered:1; + unsigned long bd_registered:1, + /* bulk request is RDMA transfer, use page->host as real address */ + bd_is_rdma:1; /** For serialization with callback */ spinlock_t bd_lock; /** {put,get}{source,sink}{kiov} */ diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c index d84884f..21e036e 100644 --- 
a/fs/lustre/osc/osc_request.c +++ b/fs/lustre/osc/osc_request.c @@ -1416,6 +1416,7 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli, const char *obd_name = cli->cl_import->imp_obd->obd_name; struct inode *inode = NULL; bool directio = false; + bool gpu = 0; bool enable_checksum = true; struct cl_page *clpage; @@ -1581,6 +1582,7 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli, if (brw_page2oap(pga[0])->oap_brw_flags & OBD_BRW_RDMA_ONLY) { enable_checksum = false; short_io_size = 0; + gpu = 1; } /* Check if read/write is small enough to be a short io. */ @@ -1632,6 +1634,7 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli, goto out; } /* NB request now owns desc and will free it when it gets freed */ + desc->bd_is_rdma = gpu; no_bulk: body = req_capsule_client_get(pill, &RMF_OST_BODY); ioobj = req_capsule_client_get(pill, &RMF_OBD_IOOBJ); diff --git a/fs/lustre/ptlrpc/pers.c b/fs/lustre/ptlrpc/pers.c index e24c8e3..b35d2fe 100644 --- a/fs/lustre/ptlrpc/pers.c +++ b/fs/lustre/ptlrpc/pers.c @@ -58,6 +58,9 @@ void ptlrpc_fill_bulk_md(struct lnet_md *md, struct ptlrpc_bulk_desc *desc, return; } + if (desc->bd_is_rdma) + md->options |= LNET_MD_GPU_ADDR; + if (mdidx == (desc->bd_md_count - 1)) md->length = desc->bd_iov_count - start; else diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h index f7f0b0b..1827f4e 100644 --- a/include/linux/lnet/lib-types.h +++ b/include/linux/lnet/lib-types.h @@ -138,8 +138,6 @@ struct lnet_msg { enum lnet_msg_hstatus msg_health_status; /* This is a recovery message */ bool msg_recovery; - /* force an RDMA even if the message size is < 4K */ - bool msg_rdma_force; /* the number of times a transmission has been retried */ int msg_retry_count; /* flag to indicate that we do not want to resend this message */ @@ -245,6 +243,7 @@ struct lnet_libmd { */ #define LNET_MD_FLAG_HANDLING BIT(3) #define LNET_MD_FLAG_DISCARD BIT(4) +#define LNET_MD_FLAG_GPU BIT(5) /**< 
Special mapping needs */ struct lnet_test_peer { /* info about peers we are trying to fail */ diff --git a/include/uapi/linux/lnet/lnet-types.h b/include/uapi/linux/lnet/lnet-types.h index c5fca5c..5a2ea45 100644 --- a/include/uapi/linux/lnet/lnet-types.h +++ b/include/uapi/linux/lnet/lnet-types.h @@ -467,6 +467,8 @@ struct lnet_md { #define LNET_MD_TRACK_RESPONSE (1 << 10) /** See struct lnet_md::options. */ #define LNET_MD_NO_TRACK_RESPONSE (1 << 11) +/** Special page mapping handling */ +#define LNET_MD_GPU_ADDR (1 << 13) /** Infinite threshold on MD operations. See lnet_md::threshold */ #define LNET_MD_THRESH_INF (-1) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.h b/net/lnet/klnds/o2iblnd/o2iblnd.h index e798695..0066e85 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.h +++ b/net/lnet/klnds/o2iblnd/o2iblnd.h @@ -401,8 +401,9 @@ struct kib_tx { /* transmit message */ struct kib_tx_pool *tx_pool; /* pool I'm from */ struct kib_conn *tx_conn; /* owning conn */ short tx_sending; /* # tx callbacks outstanding */ - short tx_queued; /* queued for sending */ - short tx_waiting; /* waiting for peer_ni */ + unsigned long tx_queued:1, /* queued for sending */ + tx_waiting:1, /* waiting for peer_ni */ + tx_gpu:1; /* force DMA */ int tx_status; /* LNET completion status */ enum lnet_msg_hstatus tx_hstatus; /* health status of the transmit */ ktime_t tx_deadline; /* completion deadline */ @@ -861,17 +862,23 @@ static inline void kiblnd_dma_unmap_single(struct ib_device *dev, #define KIBLND_UNMAP_ADDR_SET(p, m, a) do {} while (0) #define KIBLND_UNMAP_ADDR(p, m, a) (a) -static inline int kiblnd_dma_map_sg(struct kib_hca_dev *hdev, - struct scatterlist *sg, int nents, - enum dma_data_direction direction) +static inline +int kiblnd_dma_map_sg(struct kib_hca_dev *hdev, struct kib_tx *tx) { + struct scatterlist *sg = tx->tx_frags; + int nents = tx->tx_nfrags; + enum dma_data_direction direction = tx->tx_dmadir; + return ib_dma_map_sg(hdev->ibh_ibdev, sg, nents, direction); } -static 
inline void kiblnd_dma_unmap_sg(struct kib_hca_dev *hdev, - struct scatterlist *sg, int nents, - enum dma_data_direction direction) +static inline +void kiblnd_dma_unmap_sg(struct kib_hca_dev *hdev, struct kib_tx *tx) { + struct scatterlist *sg = tx->tx_frags; + int nents = tx->tx_nfrags; + enum dma_data_direction direction = tx->tx_dmadir; + ib_dma_unmap_sg(hdev->ibh_ibdev, sg, nents, direction); } diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c index cb96282..01fa499 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -623,8 +623,7 @@ static void kiblnd_unmap_tx(struct kib_tx *tx) kiblnd_fmr_pool_unmap(&tx->tx_fmr, tx->tx_status); if (tx->tx_nfrags) { - kiblnd_dma_unmap_sg(tx->tx_pool->tpo_hdev, - tx->tx_frags, tx->tx_nfrags, tx->tx_dmadir); + kiblnd_dma_unmap_sg(tx->tx_pool->tpo_hdev, tx); tx->tx_nfrags = 0; } } @@ -644,9 +643,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, tx->tx_dmadir = (rd != tx->tx_rd) ? 
DMA_FROM_DEVICE : DMA_TO_DEVICE; tx->tx_nfrags = nfrags; - rd->rd_nfrags = kiblnd_dma_map_sg(hdev, tx->tx_frags, - tx->tx_nfrags, tx->tx_dmadir); - + rd->rd_nfrags = kiblnd_dma_map_sg(hdev, tx); for (i = 0, nob = 0; i < rd->rd_nfrags; i++) { rd->rd_frags[i].rf_nob = kiblnd_sg_dma_len( hdev->ibh_ibdev, &tx->tx_frags[i]); @@ -1076,7 +1073,8 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, int prev = dstidx; if (srcidx >= srcrd->rd_nfrags) { - CERROR("Src buffer exhausted: %d frags\n", srcidx); + CERROR("Src buffer exhausted: %d frags %px\n", + srcidx, tx); rc = -EPROTO; break; } @@ -1540,10 +1538,12 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, struct bio_vec *payload_kiov = lntmsg->msg_kiov; unsigned int payload_offset = lntmsg->msg_offset; unsigned int payload_nob = lntmsg->msg_len; + struct lnet_libmd *msg_md = lntmsg->msg_md; struct iov_iter from; struct kib_msg *ibmsg; struct kib_rdma_desc *rd; struct kib_tx *tx; + bool gpu; int nob; int rc; @@ -1571,6 +1571,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, return -ENOMEM; } ibmsg = tx->tx_msg; + gpu = msg_md ? (msg_md->md_flags & LNET_MD_FLAG_GPU) : false; switch (type) { default: @@ -1586,11 +1587,13 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, break; /* send IMMEDIATE */ /* is the REPLY message too small for RDMA? */ - nob = offsetof(struct kib_msg, ibm_u.immediate.ibim_payload[lntmsg->msg_md->md_length]); - if (nob <= IBLND_MSG_SIZE && !lntmsg->msg_rdma_force) + nob = offsetof(struct kib_msg, + ibm_u.immediate.ibim_payload[lntmsg->msg_md->md_length]); + if (nob <= IBLND_MSG_SIZE && !gpu) break; /* send IMMEDIATE */ rd = &ibmsg->ibm_u.get.ibgm_rd; + tx->tx_gpu = gpu; rc = kiblnd_setup_rd_kiov(ni, tx, rd, payload_niov, payload_kiov, payload_offset, payload_nob); @@ -1626,9 +1629,11 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, case LNET_MSG_PUT: /* Is the payload small enough not to need RDMA? 
*/ nob = offsetof(struct kib_msg, ibm_u.immediate.ibim_payload[payload_nob]); - if (nob <= IBLND_MSG_SIZE && !lntmsg->msg_rdma_force) + if (nob <= IBLND_MSG_SIZE && !gpu) break; /* send IMMEDIATE */ + tx->tx_gpu = gpu; + rc = kiblnd_setup_rd_kiov(ni, tx, tx->tx_rd, payload_niov, payload_kiov, payload_offset, payload_nob); @@ -1712,6 +1717,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, struct bio_vec *kiov = lntmsg->msg_kiov; unsigned int offset = lntmsg->msg_offset; unsigned int nob = lntmsg->msg_len; + struct lnet_libmd *payload_md = lntmsg->msg_md; struct kib_tx *tx; int rc; @@ -1722,6 +1728,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, goto failed_0; } + tx->tx_gpu = !!(payload_md->md_flags & LNET_MD_FLAG_GPU); if (!nob) rc = 0; else @@ -1784,7 +1791,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, struct kib_tx *tx; int nob; int post_credit = IBLND_POSTRX_PEER_CREDIT; - u64 ibprm_cookie = rxmsg->ibm_u.putreq.ibprm_cookie; + u64 ibprm_cookie; int rc = 0; LASSERT(iov_iter_count(to) <= rlen); @@ -1819,6 +1826,9 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, case IBLND_MSG_PUT_REQ: { struct kib_msg *txmsg; struct kib_rdma_desc *rd; + struct lnet_libmd *payload_md = lntmsg->msg_md; + + ibprm_cookie = rxmsg->ibm_u.putreq.ibprm_cookie; if (!iov_iter_count(to)) { lnet_finalize(lntmsg, 0); @@ -1836,6 +1846,7 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx, break; } + tx->tx_gpu = !!(payload_md->md_flags & LNET_MD_FLAG_GPU); txmsg = tx->tx_msg; rd = &txmsg->ibm_u.putack.ibpam_rd; rc = kiblnd_setup_rd_kiov(ni, tx, rd, diff --git a/net/lnet/lnet/lib-md.c b/net/lnet/lnet/lib-md.c index affa921..05fb666 100644 --- a/net/lnet/lnet/lib-md.c +++ b/net/lnet/lnet/lib-md.c @@ -192,6 +192,9 @@ struct page * lmd->md_flags = (unlink == LNET_UNLINK) ? 
LNET_MD_FLAG_AUTO_UNLINK : 0; lmd->md_bulk_handle = umd->bulk_handle; + if (umd->options & LNET_MD_GPU_ADDR) + lmd->md_flags |= LNET_MD_FLAG_GPU; + if (umd->options & LNET_MD_KIOV) { niov = umd->length; lmd->md_niov = umd->length; diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index 0c5bf82..53e953f 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -1450,11 +1450,13 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats, u32 best_sel_prio; unsigned int best_dev_prio; unsigned int dev_idx = UINT_MAX; - struct page *page = lnet_get_first_page(md, offset); + bool gpu = md ? (md->md_flags & LNET_MD_FLAG_GPU) : false; + + if (gpu) { + struct page *page = lnet_get_first_page(md, offset); - msg->msg_rdma_force = lnet_is_rdma_only_page(page); - if (msg->msg_rdma_force) dev_idx = lnet_get_dev_idx(page); + } /* If there is no peer_ni that we can send to on this network, * then there is no point in looking for a new best_ni here. @@ -1505,7 +1507,7 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats, * All distances smaller than the NUMA range * are treated equally. 
*/ - if (distance < lnet_numa_range) + if (!gpu && distance < lnet_numa_range) distance = lnet_numa_range; /* * Select on health, selection policy, direct dma prio,

From patchwork Thu Aug 4 01:37:56 2022
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 12935982
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Wed, 3 Aug 2022 21:37:56 -0400
Message-Id: <1659577097-19253-12-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 11/32] lustre: som: disabling xattr cache for LSOM on client
From: Qian Yingjin

To obtain up-to-date LSOM data, a client currently needs to set llite.*.xattr_cache=0, which disables the xattr cache on the client completely. As a side effect, no other kinds of xattrs can be cached on the client either. This patch introduces a light-weight solution that disables caching only for the LSOM xattr data ("trusted.som") on the client.

WC-bug-id: https://jira.whamcloud.com/browse/LU-11695
Lustre-commit: 192902851d73ec246 ("LU-11695 som: disabling xattr cache for LSOM on client")
Signed-off-by: Qian Yingjin
Reviewed-on: https://review.whamcloud.com/33711
Reviewed-by: Andreas Dilger
Reviewed-by: Li Xi
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
fs/lustre/llite/xattr.c | 3 ++-
fs/lustre/llite/xattr_cache.c | 4 ++++
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/llite/xattr.c b/fs/lustre/llite/xattr.c index 3a342ad..11310f9 100644 --- a/fs/lustre/llite/xattr.c +++ b/fs/lustre/llite/xattr.c @@ -373,7 +373,8 @@ int ll_xattr_list(struct inode *inode, const char *name, int type, void *buffer, } if (sbi->ll_xattr_cache_enabled && type != XATTR_ACL_ACCESS_T && - (type != XATTR_SECURITY_T || strcmp(name, "security.selinux"))) { + (type != XATTR_SECURITY_T || strcmp(name, "security.selinux")) && + (type != XATTR_TRUSTED_T || strcmp(name, XATTR_NAME_SOM))) { rc = ll_xattr_cache_get(inode, name, buffer, size, valid); if (rc == -EAGAIN) goto getxattr_nocache; diff --git a/fs/lustre/llite/xattr_cache.c b/fs/lustre/llite/xattr_cache.c index 723cc39..7e5b807 100644 --- a/fs/lustre/llite/xattr_cache.c +++ b/fs/lustre/llite/xattr_cache.c @@ -465,6 +465,10 @@ static int ll_xattr_cache_refill(struct inode *inode) /*
Filter out security.selinux, it is cached in slab */ CDEBUG(D_CACHE, "not caching security.selinux\n"); rc = 0; + } else if (!strcmp(xdata, XATTR_NAME_SOM)) { + /* Filter out trusted.som, it is not cached on client */ + CDEBUG(D_CACHE, "not caching trusted.som\n"); + rc = 0; } else { rc = ll_xattr_cache_add(&lli->lli_xattrs, xdata, xval, *xsizes);

From patchwork Thu Aug 4 01:37:57 2022
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 12935983
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Wed, 3 Aug 2022 21:37:57 -0400
Message-Id: <1659577097-19253-13-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 12/32] lnet: discard some peer_ni lookup functions

From: Mr NeilBrown

lnet_nid2peerni_locked(), lnet_peer_get_ni_locked(), lnet_find_peer4(), and lnet_find_peer_ni_locked() each have few users left, and those callers can be changed to use the alternate versions which take 'struct lnet_nid' rather than 'lnet_nid_t'. So convert all the remaining callers over, and discard the older functions.

WC-bug-id: https://jira.whamcloud.com/browse/LU-10391
Lustre-commit: 9768d8929a305588f ("LU-10391 lnet: discard some peer_ni lookup functions")
Signed-off-by: Mr NeilBrown
Reviewed-on: https://review.whamcloud.com/44624
Reviewed-by: James Simmons
Reviewed-by: Frank Sehr
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
include/linux/lnet/lib-lnet.h | 6 --
net/lnet/lnet/api-ni.c | 26 +++---
net/lnet/lnet/lib-move.c | 8 +-
net/lnet/lnet/peer.c | 211 +++++++++++++++--------------------
net/lnet/lnet/router.c | 18 ++--
net/lnet/lnet/udsp.c | 8 +-
6 files changed, 110 insertions(+), 167 deletions(-)

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h index e21866b..3bdb49e 100644 --- a/include/linux/lnet/lib-lnet.h +++ b/include/linux/lnet/lib-lnet.h @@ -898,19 +898,13 @@ struct lnet_peer_net *lnet_get_next_peer_net_locked(struct lnet_peer *lp, struct lnet_peer_ni *lnet_get_next_peer_ni_locked(struct lnet_peer *peer, struct lnet_peer_net *peer_net, struct lnet_peer_ni *prev); -struct lnet_peer_ni *lnet_nid2peerni_locked(lnet_nid_t nid, lnet_nid_t pref, - int cpt); struct
lnet_peer_ni *lnet_peerni_by_nid_locked(struct lnet_nid *nid, struct lnet_nid *pref, int cpt); struct lnet_peer_ni *lnet_nid2peerni_ex(struct lnet_nid *nid); -struct lnet_peer_ni *lnet_peer_get_ni_locked(struct lnet_peer *lp, - lnet_nid_t nid); struct lnet_peer_ni *lnet_peer_ni_get_locked(struct lnet_peer *lp, struct lnet_nid *nid); -struct lnet_peer_ni *lnet_find_peer_ni_locked(lnet_nid_t nid); struct lnet_peer_ni *lnet_peer_ni_find_locked(struct lnet_nid *nid); -struct lnet_peer *lnet_find_peer4(lnet_nid_t nid); struct lnet_peer *lnet_find_peer(struct lnet_nid *nid); void lnet_peer_net_added(struct lnet_net *net); void lnet_peer_primary_nid_locked(struct lnet_nid *nid, diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index 165728d..124ec86 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -4381,7 +4381,8 @@ u32 lnet_get_dlc_seq_locked(void) return rc; mutex_lock(&the_lnet.ln_api_mutex); - lp = lnet_find_peer4(ping->ping_id.nid); + lnet_nid4_to_nid(ping->ping_id.nid, &nid); + lp = lnet_find_peer(&nid); if (lp) { ping->ping_id.nid = lnet_nid_to_nid4(&lp->lp_primary_nid); @@ -4405,7 +4406,8 @@ u32 lnet_get_dlc_seq_locked(void) return rc; mutex_lock(&the_lnet.ln_api_mutex); - lp = lnet_find_peer4(discover->ping_id.nid); + lnet_nid4_to_nid(discover->ping_id.nid, &nid); + lp = lnet_find_peer(&nid); if (lp) { discover->ping_id.nid = lnet_nid_to_nid4(&lp->lp_primary_nid); @@ -4687,7 +4689,7 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid, if (nob < 8) { CERROR("%s: ping info too short %d\n", - libcfs_id2str(id4), nob); + libcfs_idstr(&id), nob); goto fail_ping_buffer_decref; } @@ -4695,19 +4697,19 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid, lnet_swap_pinginfo(pbuf); } else if (pbuf->pb_info.pi_magic != LNET_PROTO_PING_MAGIC) { CERROR("%s: Unexpected magic %08x\n", - libcfs_id2str(id4), pbuf->pb_info.pi_magic); + libcfs_idstr(&id), pbuf->pb_info.pi_magic); goto 
fail_ping_buffer_decref; } if (!(pbuf->pb_info.pi_features & LNET_PING_FEAT_NI_STATUS)) { CERROR("%s: ping w/o NI status: 0x%x\n", - libcfs_id2str(id4), pbuf->pb_info.pi_features); + libcfs_idstr(&id), pbuf->pb_info.pi_features); goto fail_ping_buffer_decref; } if (nob < LNET_PING_INFO_SIZE(0)) { CERROR("%s: Short reply %d(%d min)\n", - libcfs_id2str(id4), + libcfs_idstr(&id), nob, (int)LNET_PING_INFO_SIZE(0)); goto fail_ping_buffer_decref; } @@ -4717,7 +4719,7 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid, if (nob < LNET_PING_INFO_SIZE(n_ids)) { CERROR("%s: Short reply %d(%d expected)\n", - libcfs_id2str(id4), + libcfs_idstr(&id), nob, (int)LNET_PING_INFO_SIZE(n_ids)); goto fail_ping_buffer_decref; } @@ -4739,7 +4741,7 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid, } static int -lnet_discover(struct lnet_process_id id, u32 force, +lnet_discover(struct lnet_process_id id4, u32 force, struct lnet_process_id __user *ids, int n_ids) { @@ -4747,14 +4749,16 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid, struct lnet_peer_ni *p; struct lnet_peer *lp; struct lnet_process_id *buf; + struct lnet_processid id; int cpt; int i; int rc; if (n_ids <= 0 || - id.nid == LNET_NID_ANY) + id4.nid == LNET_NID_ANY) return -EINVAL; + lnet_pid4_to_pid(id4, &id); if (id.pid == LNET_PID_ANY) id.pid = LNET_PID_LUSTRE; @@ -4769,7 +4773,7 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid, return -ENOMEM; cpt = lnet_net_lock_current(); - lpni = lnet_nid2peerni_locked(id.nid, LNET_NID_ANY, cpt); + lpni = lnet_peerni_by_nid_locked(&id.nid, NULL, cpt); if (IS_ERR(lpni)) { rc = PTR_ERR(lpni); goto out; @@ -4795,7 +4799,7 @@ static int lnet_ping(struct lnet_process_id id4, struct lnet_nid *src_nid, * and lookup the lpni again */ lnet_peer_ni_decref_locked(lpni); - lpni = lnet_find_peer_ni_locked(id.nid); + lpni = lnet_peer_ni_find_locked(&id.nid); if (!lpni) { rc = -ENOENT; goto 
out; diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index 53e953f..a514472 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -1907,7 +1907,7 @@ struct lnet_ni * return rc; } - new_lpni = lnet_find_peer_ni_locked(lnet_nid_to_nid4(&lpni->lpni_nid)); + new_lpni = lnet_peer_ni_find_locked(&lpni->lpni_nid); if (!new_lpni) { lnet_peer_ni_decref_locked(lpni); return -ENOENT; @@ -2795,7 +2795,7 @@ struct lnet_ni * * try to send it via non-multi-rail criteria */ if (!IS_ERR(src_lpni)) { - /* Drop ref taken by lnet_nid2peerni_locked() */ + /* Drop ref taken by lnet_peerni_by_nid_locked() */ lnet_peer_ni_decref_locked(src_lpni); src_lp = lpni->lpni_peer_net->lpn_peer; if (lnet_peer_is_multi_rail(src_lp) && @@ -3523,7 +3523,7 @@ struct lnet_mt_event_info { ev_info, the_lnet.ln_mt_handler, true); lnet_net_lock(0); - /* lnet_find_peer_ni_locked() grabs a refcount for + /* lnet_peer_ni_find_locked() grabs a refcount for * us. No need to take it explicitly. */ lpni = lnet_peer_ni_find_locked(&nid); @@ -3546,7 +3546,7 @@ struct lnet_mt_event_info { spin_unlock(&lpni->lpni_lock); } - /* Drop the ref taken by lnet_find_peer_ni_locked() */ + /* Drop the ref taken by lnet_peer_ni_find_locked() */ lnet_peer_ni_decref_locked(lpni); lnet_net_unlock(0); } else { diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c index 3909c5d..7a96a2f 100644 --- a/net/lnet/lnet/peer.c +++ b/net/lnet/lnet/peer.c @@ -698,24 +698,6 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp) } struct lnet_peer_ni * -lnet_find_peer_ni_locked(lnet_nid_t nid4) -{ - struct lnet_peer_ni *lpni; - struct lnet_peer_table *ptable; - int cpt; - struct lnet_nid nid; - - lnet_nid4_to_nid(nid4, &nid); - - cpt = lnet_nid_cpt_hash(&nid, LNET_CPT_NUMBER); - - ptable = the_lnet.ln_peer_tables[cpt]; - lpni = lnet_get_peer_ni_locked(ptable, &nid); - - return lpni; -} - -struct lnet_peer_ni * lnet_peer_ni_find_locked(struct lnet_nid *nid) { struct lnet_peer_ni *lpni; @@ -731,24 
+713,6 @@ struct lnet_peer_ni * } struct lnet_peer_ni * -lnet_peer_get_ni_locked(struct lnet_peer *lp, lnet_nid_t nid) -{ - struct lnet_peer_net *lpn; - struct lnet_peer_ni *lpni; - - lpn = lnet_peer_get_net_locked(lp, LNET_NIDNET(nid)); - if (!lpn) - return NULL; - - list_for_each_entry(lpni, &lpn->lpn_peer_nis, lpni_peer_nis) { - if (lnet_nid_to_nid4(&lpni->lpni_nid) == nid) - return lpni; - } - - return NULL; -} - -struct lnet_peer_ni * lnet_peer_ni_get_locked(struct lnet_peer *lp, struct lnet_nid *nid) { struct lnet_peer_net *lpn; @@ -767,25 +731,6 @@ struct lnet_peer_ni * } struct lnet_peer * -lnet_find_peer4(lnet_nid_t nid) -{ - struct lnet_peer_ni *lpni; - struct lnet_peer *lp = NULL; - int cpt; - - cpt = lnet_net_lock_current(); - lpni = lnet_find_peer_ni_locked(nid); - if (lpni) { - lp = lpni->lpni_peer_net->lpn_peer; - lnet_peer_addref_locked(lp); - lnet_peer_ni_decref_locked(lpni); - } - lnet_net_unlock(cpt); - - return lp; -} - -struct lnet_peer * lnet_find_peer(struct lnet_nid *nid) { struct lnet_peer_ni *lpni; @@ -1620,21 +1565,20 @@ struct lnet_peer_net * * Call with the lnet_api_mutex held. */ static int -lnet_peer_add(lnet_nid_t nid4, unsigned int flags) +lnet_peer_add(struct lnet_nid *nid, unsigned int flags) { - struct lnet_nid nid; struct lnet_peer *lp; struct lnet_peer_net *lpn; struct lnet_peer_ni *lpni; int rc = 0; - LASSERT(nid4 != LNET_NID_ANY); + LASSERT(nid); /* * No need for the lnet_net_lock here, because the * lnet_api_mutex is held. */ - lpni = lnet_find_peer_ni_locked(nid4); + lpni = lnet_peer_ni_find_locked(nid); if (lpni) { /* A peer with this NID already exists. */ lp = lpni->lpni_peer_net->lpn_peer; @@ -1646,13 +1590,13 @@ struct lnet_peer_net * * that an existing peer is being modified. 
*/ if (lp->lp_state & LNET_PEER_CONFIGURED) { - if (lnet_nid_to_nid4(&lp->lp_primary_nid) != nid4) + if (!nid_same(&lp->lp_primary_nid, nid)) rc = -EEXIST; else if ((lp->lp_state ^ flags) & LNET_PEER_MULTI_RAIL) rc = -EPERM; goto out; } else if (!(flags & LNET_PEER_CONFIGURED)) { - if (lnet_nid_to_nid4(&lp->lp_primary_nid) == nid4) { + if (nid_same(&lp->lp_primary_nid, nid)) { rc = -EEXIST; goto out; } @@ -1665,14 +1609,13 @@ struct lnet_peer_net * /* Create peer, peer_net, and peer_ni. */ rc = -ENOMEM; - lnet_nid4_to_nid(nid4, &nid); - lp = lnet_peer_alloc(&nid); + lp = lnet_peer_alloc(nid); if (!lp) goto out; - lpn = lnet_peer_net_alloc(LNET_NID_NET(&nid)); + lpn = lnet_peer_net_alloc(LNET_NID_NET(nid)); if (!lpn) goto out_free_lp; - lpni = lnet_peer_ni_alloc(&nid); + lpni = lnet_peer_ni_alloc(nid); if (!lpni) goto out_free_lpn; @@ -1684,7 +1627,7 @@ struct lnet_peer_net * kfree(lp); out: CDEBUG(D_NET, "peer %s NID flags %#x: %d\n", - libcfs_nid2str(nid4), flags, rc); + libcfs_nidstr(nid), flags, rc); return rc; } @@ -1699,17 +1642,15 @@ struct lnet_peer_net * * non-multi-rail peer. */ static int -lnet_peer_add_nid(struct lnet_peer *lp, lnet_nid_t nid4, unsigned int flags) +lnet_peer_add_nid(struct lnet_peer *lp, struct lnet_nid *nid, + unsigned int flags) { struct lnet_peer_net *lpn; struct lnet_peer_ni *lpni; - struct lnet_nid nid; int rc = 0; LASSERT(lp); - LASSERT(nid4 != LNET_NID_ANY); - - lnet_nid4_to_nid(nid4, &nid); + LASSERT(nid); /* A configured peer can only be updated through configuration. */ if (!(flags & LNET_PEER_CONFIGURED)) { @@ -1735,7 +1676,7 @@ struct lnet_peer_net * goto out; } - lpni = lnet_find_peer_ni_locked(nid4); + lpni = lnet_peer_ni_find_locked(nid); if (lpni) { /* * A peer_ni already exists. 
This is only a problem if @@ -1764,14 +1705,14 @@ struct lnet_peer_net * } lnet_peer_del(lpni->lpni_peer_net->lpn_peer); lnet_peer_ni_decref_locked(lpni); - lpni = lnet_peer_ni_alloc(&nid); + lpni = lnet_peer_ni_alloc(nid); if (!lpni) { rc = -ENOMEM; goto out_free_lpni; } } } else { - lpni = lnet_peer_ni_alloc(&nid); + lpni = lnet_peer_ni_alloc(nid); if (!lpni) { rc = -ENOMEM; goto out_free_lpni; @@ -1782,9 +1723,9 @@ struct lnet_peer_net * * Get the peer_net. Check that we're not adding a second * peer_ni on a peer_net of a non-multi-rail peer. */ - lpn = lnet_peer_get_net_locked(lp, LNET_NIDNET(nid4)); + lpn = lnet_peer_get_net_locked(lp, LNET_NID_NET(nid)); if (!lpn) { - lpn = lnet_peer_net_alloc(LNET_NIDNET(nid4)); + lpn = lnet_peer_net_alloc(LNET_NID_NET(nid)); if (!lpn) { rc = -ENOMEM; goto out_free_lpni; @@ -1800,7 +1741,7 @@ struct lnet_peer_net * lnet_peer_ni_decref_locked(lpni); out: CDEBUG(D_NET, "peer %s NID %s flags %#x: %d\n", - libcfs_nidstr(&lp->lp_primary_nid), libcfs_nid2str(nid4), + libcfs_nidstr(&lp->lp_primary_nid), libcfs_nidstr(nid), flags, rc); return rc; } @@ -1811,16 +1752,16 @@ struct lnet_peer_net * * Call with the lnet_api_mutex held. */ static int -lnet_peer_set_primary_nid(struct lnet_peer *lp, lnet_nid_t nid, +lnet_peer_set_primary_nid(struct lnet_peer *lp, struct lnet_nid *nid, unsigned int flags) { struct lnet_nid old = lp->lp_primary_nid; int rc = 0; - if (lnet_nid_to_nid4(&lp->lp_primary_nid) == nid) + if (nid_same(&lp->lp_primary_nid, nid)) goto out; - lnet_nid4_to_nid(nid, &lp->lp_primary_nid); + lp->lp_primary_nid = *nid; rc = lnet_peer_add_nid(lp, nid, flags); if (rc) { @@ -1829,7 +1770,7 @@ struct lnet_peer_net * } out: CDEBUG(D_NET, "peer %s NID %s: %d\n", - libcfs_nidstr(&old), libcfs_nid2str(nid), rc); + libcfs_nidstr(&old), libcfs_nidstr(nid), rc); return rc; } @@ -1908,16 +1849,20 @@ struct lnet_peer_net * * being created/modified/deleted by a different thread. 
*/ int -lnet_add_peer_ni(lnet_nid_t prim_nid, lnet_nid_t nid, bool mr, bool temp) +lnet_add_peer_ni(lnet_nid_t prim_nid4, lnet_nid_t nid4, bool mr, bool temp) { + struct lnet_nid prim_nid, nid; struct lnet_peer *lp = NULL; struct lnet_peer_ni *lpni; unsigned int flags = 0; /* The prim_nid must always be specified */ - if (prim_nid == LNET_NID_ANY) + if (prim_nid4 == LNET_NID_ANY) return -EINVAL; + lnet_nid4_to_nid(prim_nid4, &prim_nid); + lnet_nid4_to_nid(nid4, &nid); + if (!temp) flags = LNET_PEER_CONFIGURED; @@ -1928,11 +1873,11 @@ struct lnet_peer_net * * If nid isn't specified, we must create a new peer with * prim_nid as its primary nid. */ - if (nid == LNET_NID_ANY) - return lnet_peer_add(prim_nid, flags); + if (nid4 == LNET_NID_ANY) + return lnet_peer_add(&prim_nid, flags); /* Look up the prim_nid, which must exist. */ - lpni = lnet_find_peer_ni_locked(prim_nid); + lpni = lnet_peer_ni_find_locked(&prim_nid); if (!lpni) return -ENOENT; lnet_peer_ni_decref_locked(lpni); @@ -1941,14 +1886,14 @@ struct lnet_peer_net * /* Peer must have been configured. */ if (!temp && !(lp->lp_state & LNET_PEER_CONFIGURED)) { CDEBUG(D_NET, "peer %s was not configured\n", - libcfs_nid2str(prim_nid)); + libcfs_nidstr(&prim_nid)); return -ENOENT; } /* Primary NID must match */ - if (lnet_nid_to_nid4(&lp->lp_primary_nid) != prim_nid) { + if (!nid_same(&lp->lp_primary_nid, &prim_nid)) { CDEBUG(D_NET, "prim_nid %s is not primary for peer %s\n", - libcfs_nid2str(prim_nid), + libcfs_nidstr(&prim_nid), libcfs_nidstr(&lp->lp_primary_nid)); return -ENODEV; } @@ -1956,11 +1901,11 @@ struct lnet_peer_net * /* Multi-Rail flag must match. 
*/ if ((lp->lp_state ^ flags) & LNET_PEER_MULTI_RAIL) { CDEBUG(D_NET, "multi-rail state mismatch for peer %s\n", - libcfs_nid2str(prim_nid)); + libcfs_nidstr(&prim_nid)); return -EPERM; } - return lnet_peer_add_nid(lp, nid, flags); + return lnet_peer_add_nid(lp, &nid, flags); } /* @@ -1975,24 +1920,26 @@ struct lnet_peer_net * * being modified/deleted by a different thread. */ int -lnet_del_peer_ni(lnet_nid_t prim_nid, lnet_nid_t nid) +lnet_del_peer_ni(lnet_nid_t prim_nid4, lnet_nid_t nid) { struct lnet_peer *lp; struct lnet_peer_ni *lpni; unsigned int flags; + struct lnet_nid prim_nid; - if (prim_nid == LNET_NID_ANY) + if (prim_nid4 == LNET_NID_ANY) return -EINVAL; + lnet_nid4_to_nid(prim_nid4, &prim_nid); - lpni = lnet_find_peer_ni_locked(prim_nid); + lpni = lnet_peer_ni_find_locked(&prim_nid); if (!lpni) return -ENOENT; lnet_peer_ni_decref_locked(lpni); lp = lpni->lpni_peer_net->lpn_peer; - if (prim_nid != lnet_nid_to_nid4(&lp->lp_primary_nid)) { + if (!nid_same(&prim_nid, &lp->lp_primary_nid)) { CDEBUG(D_NET, "prim_nid %s is not primary for peer %s\n", - libcfs_nid2str(prim_nid), + libcfs_nidstr(&prim_nid), libcfs_nidstr(&lp->lp_primary_nid)); return -ENODEV; } @@ -2001,7 +1948,7 @@ struct lnet_peer_net * if (lp->lp_rtr_refcount > 0) { lnet_net_unlock(LNET_LOCK_EX); CERROR("%s is a router. 
Can not be deleted\n", - libcfs_nid2str(prim_nid)); + libcfs_nidstr(&prim_nid)); return -EBUSY; } lnet_net_unlock(LNET_LOCK_EX); @@ -2141,19 +2088,6 @@ struct lnet_peer_ni * return lpni; } -struct lnet_peer_ni * -lnet_nid2peerni_locked(lnet_nid_t nid4, lnet_nid_t pref4, int cpt) -{ - struct lnet_nid nid, pref; - - lnet_nid4_to_nid(nid4, &nid); - lnet_nid4_to_nid(pref4, &pref); - if (pref4 == LNET_NID_ANY) - return lnet_peerni_by_nid_locked(&nid, NULL, cpt); - else - return lnet_peerni_by_nid_locked(&nid, &pref, cpt); -} - bool lnet_peer_gw_discovery(struct lnet_peer *lp) { @@ -2964,6 +2898,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, lnet_nid_t *curnis = NULL; struct lnet_ni_status *addnis = NULL; lnet_nid_t *delnis = NULL; + struct lnet_nid nid; unsigned int flags; int ncurnis; int naddnis; @@ -3031,7 +2966,8 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, * peer with the latest information we * received */ - lpni = lnet_find_peer_ni_locked(curnis[i]); + lnet_nid4_to_nid(curnis[i], &nid); + lpni = lnet_peer_ni_find_locked(&nid); if (lpni) { lpni->lpni_ns_status = pbuf->pb_info.pi_ni[j].ns_status; @@ -3053,7 +2989,8 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, goto out; for (i = 0; i < naddnis; i++) { - rc = lnet_peer_add_nid(lp, addnis[i].ns_nid, flags); + lnet_nid4_to_nid(addnis[i].ns_nid, &nid); + rc = lnet_peer_add_nid(lp, &nid, flags); if (rc) { CERROR("Error adding NID %s to peer %s: %d\n", libcfs_nid2str(addnis[i].ns_nid), @@ -3061,7 +2998,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, if (rc == -ENOMEM) goto out; } - lpni = lnet_find_peer_ni_locked(addnis[i].ns_nid); + lpni = lnet_peer_ni_find_locked(&nid); if (lpni) { lpni->lpni_ns_status = addnis[i].ns_status; lnet_peer_ni_decref_locked(lpni); @@ -3090,7 +3027,8 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, * peer's lp_peer_nets list, and the peer NI for the primary NID should * be the first entry in its peer net's lpn_peer_nis list. 
*/ - lpni = lnet_find_peer_ni_locked(pbuf->pb_info.pi_ni[1].ns_nid); + lnet_nid4_to_nid(pbuf->pb_info.pi_ni[1].ns_nid, &nid); + lpni = lnet_peer_ni_find_locked(&nid); if (!lpni) { CERROR("Internal error: Failed to lookup peer NI for primary NID: %s\n", libcfs_nid2str(pbuf->pb_info.pi_ni[1].ns_nid)); @@ -3286,7 +3224,7 @@ static int lnet_peer_data_present(struct lnet_peer *lp) { struct lnet_ping_buffer *pbuf; struct lnet_peer_ni *lpni; - lnet_nid_t nid = LNET_NID_ANY; + struct lnet_nid nid; unsigned int flags; int rc = 0; @@ -3344,9 +3282,9 @@ static int lnet_peer_data_present(struct lnet_peer *lp) lnet_ping_buffer_decref(pbuf); goto out; } - nid = pbuf->pb_info.pi_ni[1].ns_nid; + lnet_nid4_to_nid(pbuf->pb_info.pi_ni[1].ns_nid, &nid); if (nid_is_lo0(&lp->lp_primary_nid)) { - rc = lnet_peer_set_primary_nid(lp, nid, flags); + rc = lnet_peer_set_primary_nid(lp, &nid, flags); if (rc) lnet_ping_buffer_decref(pbuf); else @@ -3358,19 +3296,19 @@ static int lnet_peer_data_present(struct lnet_peer *lp) * to update the status of the nids that we currently have * recorded in that peer. 
*/ - } else if (lnet_nid_to_nid4(&lp->lp_primary_nid) == nid || + } else if (nid_same(&lp->lp_primary_nid, &nid) || (lnet_is_nid_in_ping_info(lnet_nid_to_nid4(&lp->lp_primary_nid), &pbuf->pb_info) && lnet_is_discovery_disabled(lp))) { rc = lnet_peer_merge_data(lp, pbuf); } else { - lpni = lnet_find_peer_ni_locked(nid); + lpni = lnet_peer_ni_find_locked(&nid); if (!lpni || lp == lpni->lpni_peer_net->lpn_peer) { - rc = lnet_peer_set_primary_nid(lp, nid, flags); + rc = lnet_peer_set_primary_nid(lp, &nid, flags); if (rc) { CERROR("Primary NID error %s versus %s: %d\n", libcfs_nidstr(&lp->lp_primary_nid), - libcfs_nid2str(nid), rc); + libcfs_nidstr(&nid), rc); lnet_ping_buffer_decref(pbuf); } else { rc = lnet_peer_merge_data(lp, pbuf); @@ -3939,19 +3877,21 @@ void lnet_peer_discovery_stop(void) /* Debugging */ void -lnet_debug_peer(lnet_nid_t nid) +lnet_debug_peer(lnet_nid_t nid4) { char *aliveness = "NA"; struct lnet_peer_ni *lp; + struct lnet_nid nid; int cpt; - cpt = lnet_cpt_of_nid(nid, NULL); + lnet_nid4_to_nid(nid4, &nid); + cpt = lnet_nid2cpt(&nid, NULL); lnet_net_lock(cpt); - lp = lnet_nid2peerni_locked(nid, LNET_NID_ANY, cpt); + lp = lnet_peerni_by_nid_locked(&nid, NULL, cpt); if (IS_ERR(lp)) { lnet_net_unlock(cpt); - CDEBUG(D_WARNING, "No peer %s\n", libcfs_nid2str(nid)); + CDEBUG(D_WARNING, "No peer %s\n", libcfs_nidstr(&nid)); return; } @@ -4046,18 +3986,19 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk) struct lnet_peer_ni_credit_info *lpni_info; struct lnet_peer_ni *lpni; struct lnet_peer *lp; - lnet_nid_t nid; + struct lnet_nid nid; + lnet_nid_t nid4; u32 size; int rc; - lp = lnet_find_peer4(cfg->prcfg_prim_nid); - + lnet_nid4_to_nid(cfg->prcfg_prim_nid, &nid); + lp = lnet_find_peer(&nid); if (!lp) { rc = -ENOENT; goto out; } - size = sizeof(nid) + sizeof(*lpni_info) + sizeof(*lpni_stats) + + size = sizeof(nid4) + sizeof(*lpni_info) + sizeof(*lpni_stats) + sizeof(*lpni_msg_stats) + sizeof(*lpni_hstats); size *= lp->lp_nnis; 
if (size > cfg->prcfg_size) { @@ -4094,10 +4035,10 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk) while ((lpni = lnet_get_next_peer_ni_locked(lp, NULL, lpni)) != NULL) { if (!nid_is_nid4(&lpni->lpni_nid)) continue; - nid = lnet_nid_to_nid4(&lpni->lpni_nid); - if (copy_to_user(bulk, &nid, sizeof(nid))) + nid4 = lnet_nid_to_nid4(&lpni->lpni_nid); + if (copy_to_user(bulk, &nid4, sizeof(nid4))) goto out_free_hstats; - bulk += sizeof(nid); + bulk += sizeof(nid4); memset(lpni_info, 0, sizeof(*lpni_info)); snprintf(lpni_info->cr_aliveness, LNET_MAX_STR_LEN, "NA"); @@ -4218,12 +4159,13 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk) /* Call with the ln_api_mutex held */ void -lnet_peer_ni_set_healthv(lnet_nid_t nid, int value, bool all) +lnet_peer_ni_set_healthv(lnet_nid_t nid4, int value, bool all) { struct lnet_peer_table *ptable; struct lnet_peer *lp; struct lnet_peer_net *lpn; struct lnet_peer_ni *lpni; + struct lnet_nid nid; int lncpt; int cpt; time64_t now; @@ -4231,11 +4173,12 @@ int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk) if (the_lnet.ln_state != LNET_STATE_RUNNING) return; + lnet_nid4_to_nid(nid4, &nid); now = ktime_get_seconds(); if (!all) { lnet_net_lock(LNET_LOCK_EX); - lpni = lnet_find_peer_ni_locked(nid); + lpni = lnet_peer_ni_find_locked(&nid); if (!lpni) { lnet_net_unlock(LNET_LOCK_EX); return; diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c index b4f7aaa..bbef2b3 100644 --- a/net/lnet/lnet/router.c +++ b/net/lnet/lnet/router.c @@ -1199,8 +1199,7 @@ bool lnet_router_checker_active(void) spin_unlock(&rtr->lp_lock); /* find the peer_ni associated with the primary NID */ - lpni = lnet_peer_get_ni_locked(rtr, - lnet_nid_to_nid4(&rtr->lp_primary_nid)); + lpni = lnet_peer_ni_get_locked(rtr, &rtr->lp_primary_nid); if (!lpni) { CDEBUG(D_NET, "Expected to find an lpni for %s, but non found\n", @@ -1701,25 +1700,26 @@ bool lnet_router_checker_active(void) * when: 
notification time. */ int -lnet_notify(struct lnet_ni *ni, lnet_nid_t nid, bool alive, bool reset, +lnet_notify(struct lnet_ni *ni, lnet_nid_t nid4, bool alive, bool reset, time64_t when) { struct lnet_peer_ni *lpni = NULL; struct lnet_route *route; struct lnet_peer *lp; time64_t now = ktime_get_seconds(); + struct lnet_nid nid; int cpt; LASSERT(!in_interrupt()); CDEBUG(D_NET, "%s notifying %s: %s\n", !ni ? "userspace" : libcfs_nidstr(&ni->ni_nid), - libcfs_nid2str(nid), alive ? "up" : "down"); + libcfs_nidstr(&nid), alive ? "up" : "down"); if (ni && - LNET_NID_NET(&ni->ni_nid) != LNET_NIDNET(nid)) { + LNET_NID_NET(&ni->ni_nid) != LNET_NID_NET(&nid)) { CWARN("Ignoring notification of %s %s by %s (different net)\n", - libcfs_nid2str(nid), alive ? "birth" : "death", + libcfs_nidstr(&nid), alive ? "birth" : "death", libcfs_nidstr(&ni->ni_nid)); return -EINVAL; } @@ -1728,7 +1728,7 @@ bool lnet_router_checker_active(void) if (when > now) { CWARN("Ignoring prediction from %s of %s %s %lld seconds in the future\n", ni ? libcfs_nidstr(&ni->ni_nid) : "userspace", - libcfs_nid2str(nid), alive ? "up" : "down", when - now); + libcfs_nidstr(&nid), alive ?
"up" : "down", when - now); return -EINVAL; } @@ -1746,11 +1746,11 @@ bool lnet_router_checker_active(void) return -ESHUTDOWN; } - lpni = lnet_find_peer_ni_locked(nid); + lpni = lnet_peer_ni_find_locked(&nid); if (!lpni) { /* nid not found */ lnet_net_unlock(0); - CDEBUG(D_NET, "%s not found\n", libcfs_nid2str(nid)); + CDEBUG(D_NET, "%s not found\n", libcfs_nidstr(&nid)); return 0; } diff --git a/net/lnet/lnet/udsp.c b/net/lnet/lnet/udsp.c index 7fa4f88..2594df1 100644 --- a/net/lnet/lnet/udsp.c +++ b/net/lnet/lnet/udsp.c @@ -1052,17 +1052,19 @@ struct lnet_udsp * { struct lnet_ni *ni; struct lnet_peer_ni *lpni; + struct lnet_nid nid; lnet_net_lock(0); + lnet_nid4_to_nid(info->cud_nid, &nid); if (!info->cud_peer) { - ni = lnet_nid2ni_locked(info->cud_nid, 0); + ni = lnet_nid_to_ni_locked(&nid, 0); if (ni) lnet_udsp_get_ni_info(info, ni); } else { - lpni = lnet_find_peer_ni_locked(info->cud_nid); + lpni = lnet_peer_ni_find_locked(&nid); if (!lpni) { CDEBUG(D_NET, "nid %s is not found\n", - libcfs_nid2str(info->cud_nid)); + libcfs_nidstr(&nid)); } else { lnet_udsp_get_peer_info(info, lpni); lnet_peer_ni_decref_locked(lpni); From patchwork Thu Aug 4 01:37:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12935998 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5BDCFC19F29 for ; Thu, 4 Aug 2022 01:40:07 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4LyryG5XK0z231H; Wed, 3 Aug 2022 18:40:06 -0700 (PDT) Received: from smtp4.ccs.ornl.gov 
(smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4LyrwW148Nz23JV for ; Wed, 3 Aug 2022 18:38:35 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id C66FF100B003; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C49BF905FD; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:37:58 -0400 Message-Id: <1659577097-19253-14-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 13/32] lnet: change lnet_*_peer_ni to take struct lnet_nid X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown lnet_add_peer_ni() and lnet_del_peer_ni() now take struct lnet_nid rather than lnet_nid_t. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-10391 Lustre-commit: d9af9b5a7ee706660 ("LU-10391 lnet: change lnet_*_peer_ni to take struct lnet_nid") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/44625 Reviewed-by: James Simmons Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-lnet.h | 5 +-- net/lnet/lnet/api-ni.c | 14 +++++--- net/lnet/lnet/peer.c | 74 +++++++++++++++++++++---------------------- 3 files changed, 49 insertions(+), 44 deletions(-) diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h index 3bdb49e..5a83190 100644 --- a/include/linux/lnet/lib-lnet.h +++ b/include/linux/lnet/lib-lnet.h @@ -930,8 +930,9 @@ bool lnet_peer_is_pref_rtr_locked(struct lnet_peer_ni *lpni, int lnet_peer_add_pref_rtr(struct lnet_peer_ni *lpni, struct lnet_nid *nid); int lnet_peer_ni_set_non_mr_pref_nid(struct lnet_peer_ni *lpni, struct lnet_nid *nid); -int lnet_add_peer_ni(lnet_nid_t key_nid, lnet_nid_t nid, bool mr, bool temp); -int lnet_del_peer_ni(lnet_nid_t key_nid, lnet_nid_t nid); +int lnet_add_peer_ni(struct lnet_nid *key_nid, struct lnet_nid *nid, bool mr, + bool temp); +int lnet_del_peer_ni(struct lnet_nid *key_nid, struct lnet_nid *nid); int lnet_get_peer_info(struct lnet_ioctl_peer_cfg *cfg, void __user *bulk); int lnet_get_peer_ni_info(u32 peer_index, u64 *nid, char alivness[LNET_MAX_STR_LEN], diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index 124ec86..7c94d16 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -4176,27 +4176,31 @@ u32 lnet_get_dlc_seq_locked(void) case IOC_LIBCFS_ADD_PEER_NI: { struct lnet_ioctl_peer_cfg *cfg = arg; + struct lnet_nid prim_nid; if (cfg->prcfg_hdr.ioc_len < sizeof(*cfg)) return -EINVAL; mutex_lock(&the_lnet.ln_api_mutex); - rc = lnet_add_peer_ni(cfg->prcfg_prim_nid, - cfg->prcfg_cfg_nid, - cfg->prcfg_mr, false); + lnet_nid4_to_nid(cfg->prcfg_prim_nid, &prim_nid); + 
lnet_nid4_to_nid(cfg->prcfg_cfg_nid, &nid); + rc = lnet_add_peer_ni(&prim_nid, &nid, cfg->prcfg_mr, false); mutex_unlock(&the_lnet.ln_api_mutex); return rc; } case IOC_LIBCFS_DEL_PEER_NI: { struct lnet_ioctl_peer_cfg *cfg = arg; + struct lnet_nid prim_nid; if (cfg->prcfg_hdr.ioc_len < sizeof(*cfg)) return -EINVAL; mutex_lock(&the_lnet.ln_api_mutex); - rc = lnet_del_peer_ni(cfg->prcfg_prim_nid, - cfg->prcfg_cfg_nid); + lnet_nid4_to_nid(cfg->prcfg_prim_nid, &prim_nid); + lnet_nid4_to_nid(cfg->prcfg_cfg_nid, &nid); + rc = lnet_del_peer_ni(&prim_nid, + &nid); mutex_unlock(&the_lnet.ln_api_mutex); return rc; } diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c index 7a96a2f..8d81a7d 100644 --- a/net/lnet/lnet/peer.c +++ b/net/lnet/lnet/peer.c @@ -519,15 +519,14 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp) * -EBUSY: The lnet_peer_ni is the primary, and not the only peer_ni. */ static int -lnet_peer_del_nid(struct lnet_peer *lp, lnet_nid_t nid4, unsigned int flags) +lnet_peer_del_nid(struct lnet_peer *lp, struct lnet_nid *nid, + unsigned int flags) { struct lnet_peer_ni *lpni; struct lnet_nid primary_nid = lp->lp_primary_nid; - struct lnet_nid nid; int rc = 0; bool force = (flags & LNET_PEER_RTR_NI_FORCE_DEL) ? true : false; - lnet_nid4_to_nid(nid4, &nid); if (!(flags & LNET_PEER_CONFIGURED)) { if (lp->lp_state & LNET_PEER_CONFIGURED) { rc = -EPERM; @@ -535,7 +534,7 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp) } } - lpni = lnet_peer_ni_find_locked(&nid); + lpni = lnet_peer_ni_find_locked(nid); if (!lpni) { rc = -ENOENT; goto out; @@ -550,14 +549,14 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp) * This function only allows deletion of the primary NID if it * is the only NID. 
*/ - if (nid_same(&nid, &lp->lp_primary_nid) && lp->lp_nnis != 1 && !force) { + if (nid_same(nid, &lp->lp_primary_nid) && lp->lp_nnis != 1 && !force) { rc = -EBUSY; goto out; } lnet_net_lock(LNET_LOCK_EX); - if (nid_same(&nid, &lp->lp_primary_nid) && lp->lp_nnis != 1 && force) { + if (nid_same(nid, &lp->lp_primary_nid) && lp->lp_nnis != 1 && force) { struct lnet_peer_ni *lpni2; /* assign the next peer_ni to be the primary */ lpni2 = lnet_get_next_peer_ni_locked(lp, NULL, lpni); @@ -570,7 +569,7 @@ static void lnet_peer_cancel_discovery(struct lnet_peer *lp) out: CDEBUG(D_NET, "peer %s NID %s flags %#x: %d\n", - libcfs_nidstr(&primary_nid), libcfs_nidstr(&nid), + libcfs_nidstr(&primary_nid), libcfs_nidstr(nid), flags, rc); return rc; @@ -1333,7 +1332,7 @@ struct lnet_peer_ni * int LNetAddPeer(lnet_nid_t *nids, u32 num_nids) { - lnet_nid_t pnid = 0; + struct lnet_nid pnid = LNET_ANY_NID; bool mr; int i, rc; @@ -1350,16 +1349,21 @@ struct lnet_peer_ni * rc = 0; for (i = 0; i < num_nids; i++) { + struct lnet_nid nid; + if (nids[i] == LNET_NID_LO_0) continue; - if (!pnid) { - pnid = nids[i]; - rc = lnet_add_peer_ni(pnid, LNET_NID_ANY, mr, true); + lnet_nid4_to_nid(nids[i], &nid); + if (LNET_NID_IS_ANY(&pnid)) { + lnet_nid4_to_nid(nids[i], &pnid); + rc = lnet_add_peer_ni(&pnid, &LNET_ANY_NID, mr, true); } else if (lnet_peer_discovery_disabled) { - rc = lnet_add_peer_ni(nids[i], LNET_NID_ANY, mr, true); + lnet_nid4_to_nid(nids[i], &nid); + rc = lnet_add_peer_ni(&nid, &LNET_ANY_NID, mr, true); } else { - rc = lnet_add_peer_ni(pnid, nids[i], mr, true); + lnet_nid4_to_nid(nids[i], &nid); + rc = lnet_add_peer_ni(&pnid, &nid, mr, true); } if (rc && rc != -EEXIST) @@ -1849,20 +1853,17 @@ struct lnet_peer_net * * being created/modified/deleted by a different thread. 
*/ int -lnet_add_peer_ni(lnet_nid_t prim_nid4, lnet_nid_t nid4, bool mr, bool temp) +lnet_add_peer_ni(struct lnet_nid *prim_nid, struct lnet_nid *nid, bool mr, + bool temp) { - struct lnet_nid prim_nid, nid; struct lnet_peer *lp = NULL; struct lnet_peer_ni *lpni; unsigned int flags = 0; /* The prim_nid must always be specified */ - if (prim_nid4 == LNET_NID_ANY) + if (LNET_NID_IS_ANY(prim_nid)) return -EINVAL; - lnet_nid4_to_nid(prim_nid4, &prim_nid); - lnet_nid4_to_nid(nid4, &nid); - if (!temp) flags = LNET_PEER_CONFIGURED; @@ -1873,11 +1874,11 @@ struct lnet_peer_net * * If nid isn't specified, we must create a new peer with * prim_nid as its primary nid. */ - if (nid4 == LNET_NID_ANY) - return lnet_peer_add(&prim_nid, flags); + if (LNET_NID_IS_ANY(nid)) + return lnet_peer_add(prim_nid, flags); /* Look up the prim_nid, which must exist. */ - lpni = lnet_peer_ni_find_locked(&prim_nid); + lpni = lnet_peer_ni_find_locked(prim_nid); if (!lpni) return -ENOENT; lnet_peer_ni_decref_locked(lpni); @@ -1886,14 +1887,14 @@ struct lnet_peer_net * /* Peer must have been configured. */ if (!temp && !(lp->lp_state & LNET_PEER_CONFIGURED)) { CDEBUG(D_NET, "peer %s was not configured\n", - libcfs_nidstr(&prim_nid)); + libcfs_nidstr(prim_nid)); return -ENOENT; } /* Primary NID must match */ - if (!nid_same(&lp->lp_primary_nid, &prim_nid)) { + if (!nid_same(&lp->lp_primary_nid, prim_nid)) { CDEBUG(D_NET, "prim_nid %s is not primary for peer %s\n", - libcfs_nidstr(&prim_nid), + libcfs_nidstr(prim_nid), libcfs_nidstr(&lp->lp_primary_nid)); return -ENODEV; } @@ -1901,11 +1902,11 @@ struct lnet_peer_net * /* Multi-Rail flag must match. 
*/ if ((lp->lp_state ^ flags) & LNET_PEER_MULTI_RAIL) { CDEBUG(D_NET, "multi-rail state mismatch for peer %s\n", - libcfs_nidstr(&prim_nid)); + libcfs_nidstr(prim_nid)); return -EPERM; } - return lnet_peer_add_nid(lp, &nid, flags); + return lnet_peer_add_nid(lp, nid, flags); } /* @@ -1920,26 +1921,24 @@ struct lnet_peer_net * * being modified/deleted by a different thread. */ int -lnet_del_peer_ni(lnet_nid_t prim_nid4, lnet_nid_t nid) +lnet_del_peer_ni(struct lnet_nid *prim_nid, struct lnet_nid *nid) { struct lnet_peer *lp; struct lnet_peer_ni *lpni; unsigned int flags; - struct lnet_nid prim_nid; - if (prim_nid4 == LNET_NID_ANY) + if (!prim_nid || LNET_NID_IS_ANY(prim_nid)) return -EINVAL; - lnet_nid4_to_nid(prim_nid4, &prim_nid); - lpni = lnet_peer_ni_find_locked(&prim_nid); + lpni = lnet_peer_ni_find_locked(prim_nid); if (!lpni) return -ENOENT; lnet_peer_ni_decref_locked(lpni); lp = lpni->lpni_peer_net->lpn_peer; - if (!nid_same(&prim_nid, &lp->lp_primary_nid)) { + if (!nid_same(prim_nid, &lp->lp_primary_nid)) { CDEBUG(D_NET, "prim_nid %s is not primary for peer %s\n", - libcfs_nidstr(&prim_nid), + libcfs_nidstr(prim_nid), libcfs_nidstr(&lp->lp_primary_nid)); return -ENODEV; } @@ -1948,12 +1947,12 @@ struct lnet_peer_net * if (lp->lp_rtr_refcount > 0) { lnet_net_unlock(LNET_LOCK_EX); CERROR("%s is a router. Can not be deleted\n", - libcfs_nidstr(&prim_nid)); + libcfs_nidstr(prim_nid)); return -EBUSY; } lnet_net_unlock(LNET_LOCK_EX); - if (nid == LNET_NID_ANY || nid == lnet_nid_to_nid4(&lp->lp_primary_nid)) + if (LNET_NID_IS_ANY(nid) || nid_same(nid, &lp->lp_primary_nid)) return lnet_peer_del(lp); flags = LNET_PEER_CONFIGURED; @@ -3011,9 +3010,10 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, * being told that the router changed its primary_nid * then it's okay to delete it. 
*/ + lnet_nid4_to_nid(delnis[i], &nid); if (lp->lp_rtr_refcount > 0) flags |= LNET_PEER_RTR_NI_FORCE_DEL; - rc = lnet_peer_del_nid(lp, delnis[i], flags); + rc = lnet_peer_del_nid(lp, &nid, flags); if (rc) { CERROR("Error deleting NID %s from peer %s: %d\n", libcfs_nid2str(delnis[i]), From patchwork Thu Aug 4 01:37:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12935985 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5BC86C19F2D for ; Thu, 4 Aug 2022 01:39:15 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4LyrxG6bnxz23HT; Wed, 3 Aug 2022 18:39:14 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4LyrwW6ZcTz23JV for ; Wed, 3 Aug 2022 18:38:35 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id CA9C7100B004; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C7CE282CCE; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:37:59 -0400 Message-Id: <1659577097-19253-15-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: 
<1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 14/32] lnet: Ensure round robin across nets X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn Introduce a global net sequence number and a peer sequence number. These sequence numbers are used to ensure round robin selection of local NIs and peer NIs across nets. Also consolidate the sequence number accounting under lnet_handle_send(). Previously the sequence number increment for the final destination peer net/peer NI on a routed send was done in lnet_handle_find_routed_path(). Some cleanup that is also in this patch: - Redundant check of null src_nid is removed from lnet_handle_find_routed_path() (LNET_NID_IS_ANY handles null arg) - Avoid comparing best_lpn with itself in lnet_handle_find_routed_path() on the first loop iteration - In lnet_find_best_ni_on_local_net() check whether we have a specified lp_disc_net_id outside of the loop to avoid doing that work on each loop iteration. Added some debug statements to print information used when selecting peer net/local net. 
HPE-bug-id: LUS-10871 WC-bug-id: https://jira.whamcloud.com/browse/LU-15713 Lustre-commit: 05413b3d84f7d1feb ("LU-15713 lnet: Ensure round robin across nets") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/46976 Reviewed-by: Serguei Smirnov Reviewed-by: Cyril Bordage Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-types.h | 11 ++++- net/lnet/lnet/lib-move.c | 96 +++++++++++++++++++++++++++--------------- 2 files changed, 72 insertions(+), 35 deletions(-) diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h index 1827f4e..09b9d8e 100644 --- a/include/linux/lnet/lib-types.h +++ b/include/linux/lnet/lib-types.h @@ -765,6 +765,11 @@ struct lnet_peer { /* cached peer aliveness */ bool lp_alive; + + /* sequence number used to round robin traffic to this peer's + * nets/NIs + */ + u32 lp_send_seq; }; /* @@ -1205,10 +1210,12 @@ struct lnet { /* LND instances */ struct list_head ln_nets; - /* network zombie list */ - struct list_head ln_net_zombie; + /* Sequence number used to round robin sends across all nets */ + u32 ln_net_seq; /* the loopback NI */ struct lnet_ni *ln_loni; + /* network zombie list */ + struct list_head ln_net_zombie; /* resend messages list */ struct list_head ln_msg_resend; /* spin lock to protect the msg resend list */ diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index a514472..6ad0963 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -1658,9 +1658,12 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats, * local ni and local net so that we pick the next ones * in Round Robin. 
*/ - best_lpni->lpni_peer_net->lpn_seq++; + best_lpni->lpni_peer_net->lpn_peer->lp_send_seq++; + best_lpni->lpni_peer_net->lpn_seq = + best_lpni->lpni_peer_net->lpn_peer->lp_send_seq; best_lpni->lpni_seq = best_lpni->lpni_peer_net->lpn_seq; - best_ni->ni_net->net_seq++; + the_lnet.ln_net_seq++; + best_ni->ni_net->net_seq = the_lnet.ln_net_seq; best_ni->ni_seq = best_ni->ni_net->net_seq; CDEBUG(D_NET, @@ -1743,6 +1746,11 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats, * lnet_select_pathway() function and is never changed. * It's safe to use it here. */ + final_dst_lpni->lpni_peer_net->lpn_peer->lp_send_seq++; + final_dst_lpni->lpni_peer_net->lpn_seq = + final_dst_lpni->lpni_peer_net->lpn_peer->lp_send_seq; + final_dst_lpni->lpni_seq = + final_dst_lpni->lpni_peer_net->lpn_seq; msg->msg_hdr.dest_nid = final_dst_lpni->lpni_nid; } else { /* if we're not routing set the dest_nid to the best peer @@ -1968,8 +1976,10 @@ struct lnet_ni * int best_lpn_healthv = 0; u32 best_lpn_sel_prio = LNET_MAX_SELECTION_PRIORITY; - CDEBUG(D_NET, "using src nid %s for route restriction\n", - src_nid ? libcfs_nidstr(src_nid) : "ANY"); + CDEBUG(D_NET, "%s route (%s) from local NI %s to destination %s\n", + LNET_NID_IS_ANY(&sd->sd_rtr_nid) ? "Lookup" : "Specified", + libcfs_nidstr(&sd->sd_rtr_nid), libcfs_nidstr(src_nid), + libcfs_nidstr(&sd->sd_dst_nid)); /* If a router nid was specified then we are replying to a GET or * sending an ACK. 
In this case we use the gateway associated with the @@ -1989,8 +1999,7 @@ struct lnet_ni * } if (!route_found) { - if (sd->sd_msg->msg_routing || - (src_nid && !LNET_NID_IS_ANY(src_nid))) { + if (sd->sd_msg->msg_routing || !LNET_NID_IS_ANY(src_nid)) { /* If I'm routing this message then I need to find the * next hop based on the destination NID * @@ -2006,6 +2015,8 @@ struct lnet_ni * libcfs_nidstr(&sd->sd_dst_nid)); return -EHOSTUNREACH; } + CDEBUG(D_NET, "best_rnet %s\n", + libcfs_net2str(best_rnet->lrn_net)); } else { /* we've already looked up the initial lpni using * dst_nid @@ -2023,10 +2034,18 @@ struct lnet_ni * if (!rnet) continue; - if (!best_lpn) { - best_lpn = lpn; - best_rnet = rnet; - } + if (!best_lpn) + goto use_lpn; + else + CDEBUG(D_NET, "n[%s, %s] h[%d, %d], p[%u, %u], s[%d, %d]\n", + libcfs_net2str(lpn->lpn_net_id), + libcfs_net2str(best_lpn->lpn_net_id), + lpn->lpn_healthv, + best_lpn->lpn_healthv, + lpn->lpn_sel_priority, + best_lpn->lpn_sel_priority, + lpn->lpn_seq, + best_lpn->lpn_seq); /* select the preferred peer net */ if (best_lpn_healthv > lpn->lpn_healthv) @@ -2054,6 +2073,9 @@ struct lnet_ni * return -EHOSTUNREACH; } + CDEBUG(D_NET, "selected best_lpn %s\n", + libcfs_net2str(best_lpn->lpn_net_id)); + sd->sd_best_lpni = lnet_find_best_lpni(sd->sd_best_ni, lnet_nid_to_nid4(&sd->sd_dst_nid), lp, @@ -2068,12 +2090,6 @@ struct lnet_ni * * NI's so update the final destination we selected */ sd->sd_final_dst_lpni = sd->sd_best_lpni; - - /* Increment the sequence number of the remote lpni so - * we can round robin over the different interfaces of - * the remote lpni - */ - sd->sd_best_lpni->lpni_seq++; } /* find the best route. 
Restrict the selection on the net of the @@ -2139,14 +2155,12 @@ struct lnet_ni * *gw_lpni = gwni; *gw_peer = gw; - /* increment the sequence numbers since now we're sure we're - * going to use this path + /* increment the sequence number since now we're sure we're + * going to use this route */ if (LNET_NID_IS_ANY(&sd->sd_rtr_nid)) { LASSERT(best_route && last_route); best_route->lr_seq = last_route->lr_seq + 1; - if (best_lpn) - best_lpn->lpn_seq++; } return 0; @@ -2220,7 +2234,15 @@ struct lnet_ni * u32 lpn_sel_prio; u32 best_net_sel_prio = LNET_MAX_SELECTION_PRIORITY; u32 net_sel_prio; - bool exit = false; + + /* if this is a discovery message and lp_disc_net_id is + * specified then use that net to send the discovery on. + */ + if (discovery && peer->lp_disc_net_id) { + best_lpn = lnet_peer_get_net_locked(peer, peer->lp_disc_net_id); + if (best_lpn && lnet_get_net_locked(best_lpn->lpn_net_id)) + goto select_best_ni; + } /* The peer can have multiple interfaces, some of them can be on * the local network and others on a routed network. We should @@ -2241,17 +2263,25 @@ struct lnet_ni * net_healthv = lnet_get_net_healthv_locked(net); net_sel_prio = net->net_sel_priority; - /* if this is a discovery message and lp_disc_net_id is - * specified then use that net to send the discovery on. 
- */ - if (peer->lp_disc_net_id == lpn->lpn_net_id && - discovery) { - exit = true; - goto select_lpn; - } - if (!best_lpn) goto select_lpn; + else + CDEBUG(D_NET, + "n[%s, %s] ph[%d, %d], pp[%u, %u], nh[%d, %d], np[%u, %u], ps[%u, %u], ns[%u, %u]\n", + libcfs_net2str(lpn->lpn_net_id), + libcfs_net2str(best_lpn->lpn_net_id), + lpn->lpn_healthv, + best_lpn_healthv, + lpn_sel_prio, + best_lpn_sel_prio, + net_healthv, + best_net_healthv, + net_sel_prio, + best_net_sel_prio, + lpn->lpn_seq, + best_lpn->lpn_seq, + net->net_seq, + best_net->net_seq); /* always select the lpn with the best health */ if (best_lpn_healthv > lpn->lpn_healthv) @@ -2291,15 +2321,15 @@ struct lnet_ni * best_lpn_sel_prio = lpn_sel_prio; best_lpn = lpn; best_net = net; - - if (exit) - break; } if (best_lpn) { /* Select the best NI on the same net as best_lpn chosen * above */ +select_best_ni: + CDEBUG(D_NET, "selected best_lpn %s\n", + libcfs_net2str(best_lpn->lpn_net_id)); best_ni = lnet_find_best_ni_on_spec_net(NULL, peer, best_lpn, msg, md_cpt); } From patchwork Thu Aug 4 01:38:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12936000 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 55A0FC19F29 for ; Thu, 4 Aug 2022 01:40:11 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4LyryL6J2Gz23Ld; Wed, 3 Aug 2022 18:40:10 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client 
certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4LyrwX54J0z23JV for ; Wed, 3 Aug 2022 18:38:36 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id CD2AC100B005; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id CAE2F8BBFC; Wed, 3 Aug 2022 21:38:23 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:38:00 -0400 Message-Id: <1659577097-19253-16-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 15/32] lustre: llite: dont restart directIO with IOCB_NOWAIT X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Qian Yingjin The FLR mirror retry case and io_uring with the IOCB_NOWAIT flag must be handled differently. int cl_io_loop(const struct lu_env *env, struct cl_io *io) { ... if (result == -EAGAIN && io->ci_ndelay) { io->ci_need_restart = 1; result = 0; } ... } ssize_t generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter) { ... if (iocb->ki_flags & IOCB_NOWAIT) { if (filemap_range_has_page(mapping, iocb->ki_pos, iocb->ki_pos + count - 1)) return -EAGAIN; ... } In the current code, the I/O engine is restarted for a read whenever it returns -EAGAIN. However, for io_uring direct I/O with IOCB_NOWAIT, if there are cached pages in the current I/O range, -EAGAIN should be returned to the upper layer immediately; otherwise the request gets stuck in an endless loop.
This patch also adds a tool "io_uring_probe" to check whether the kernel supports io_uring fully. The reason for adding this check is that the RHEL 8.5 kernel has backported io_uring: cat /proc/kallsyms |grep io_uring ffffffffa8510e10 W __x64_sys_io_uring_enter ffffffffa8510e10 W __x64_sys_io_uring_register ffffffffa8510e10 W __x64_sys_io_uring_setup but the io_uring syscalls return -ENOSYS. WC-bug-id: https://jira.whamcloud.com/browse/LU-15399 Lustre-commit: 8db455c77265063a1 ("LU-15399 llite: dont restart directIO with IOCB_NOWAIT") Signed-off-by: Qian Yingjin Reviewed-on: https://review.whamcloud.com/46147 Reviewed-by: Andreas Dilger Reviewed-by: Patrick Farrell Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 6 +++++- fs/lustre/llite/file.c | 6 ++++++ fs/lustre/obdclass/cl_io.c | 2 +- 3 files changed, 12 insertions(+), 2 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index c66e98c5..c717d03 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -1960,7 +1960,11 @@ struct cl_io { /** * Bypass quota check */ - unsigned int ci_noquota:1; + unsigned int ci_noquota:1, + /** * io_uring direct IO with flags IOCB_NOWAIT. */ + ci_iocb_nowait:1; /** * How many times the read has retried before this one. * Set by the top level and consumed by the LOV.
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 0e71b3a..3aace07 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -1587,6 +1587,12 @@ void ll_io_init(struct cl_io *io, const struct file *file, int write,
 			  IOCB_DSYNC));
 	}
 
+#ifdef IOCB_NOWAIT
+	io->ci_iocb_nowait = !!(args &&
+				(args->u.normal.via_iocb->ki_flags &
+				 IOCB_NOWAIT));
+#endif
+
 	io->ci_obj = ll_i2info(inode)->lli_clob;
 	io->ci_lockreq = CILR_MAYBE;
 	if (ll_file_nolock(file)) {
diff --git a/fs/lustre/obdclass/cl_io.c b/fs/lustre/obdclass/cl_io.c
index 4246e17..c388700 100644
--- a/fs/lustre/obdclass/cl_io.c
+++ b/fs/lustre/obdclass/cl_io.c
@@ -776,7 +776,7 @@ int cl_io_loop(const struct lu_env *env, struct cl_io *io)
 		if (rc && !result)
 			result = rc;
 
-		if (result == -EAGAIN && io->ci_ndelay) {
+		if (result == -EAGAIN && io->ci_ndelay && !io->ci_iocb_nowait) {
 			io->ci_need_restart = 1;
 			result = 0;
 		}

From patchwork Thu Aug 4 01:38:01 2022
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 3 Aug 2022 21:38:01 -0400
Message-Id: <1659577097-19253-17-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 16/32] lustre: sec: handle read-only flag

From: Sebastien Buisson

Add a new 'readonly_mount' property to nodemaps on the server side.
When this property is set, the server returns -EROFS if the client is
not mounting read-only, so the client must specify the read-only mount
option to be allowed to mount.
WC-bug-id: https://jira.whamcloud.com/browse/LU-15451
Lustre-commit: e7ce67de92dea6870 ("LU-15451 sec: read-only nodemap flag")
Signed-off-by: Sebastien Buisson
Reviewed-on: https://review.whamcloud.com/46149
Reviewed-by: Andreas Dilger
Reviewed-by: James Simmons
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/ptlrpc/import.c | 4 ++--
 fs/lustre/ptlrpc/niobuf.c | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/lustre/ptlrpc/import.c b/fs/lustre/ptlrpc/import.c
index a5fdb8a8..697b3c3 100644
--- a/fs/lustre/ptlrpc/import.c
+++ b/fs/lustre/ptlrpc/import.c
@@ -1306,10 +1306,10 @@ static int ptlrpc_connect_interpret(const struct lu_env *env,
 			time64_t next_connect;
 
 			import_set_state_nolock(imp, LUSTRE_IMP_DISCON);
-			if (rc == -EACCES) {
+			if (rc == -EACCES || rc == -EROFS) {
 				/*
 				 * Give up trying to reconnect
-				 * EACCES means client has no permission for connection
+				 * EROFS means client must mount read-only
 				 */
 				imp->imp_obd->obd_no_recov = 1;
 				ptlrpc_deactivate_import_nolock(imp);
diff --git a/fs/lustre/ptlrpc/niobuf.c b/fs/lustre/ptlrpc/niobuf.c
index 94a0329..be1811a 100644
--- a/fs/lustre/ptlrpc/niobuf.c
+++ b/fs/lustre/ptlrpc/niobuf.c
@@ -472,7 +472,8 @@ int ptlrpc_send_error(struct ptlrpc_request *req, int may_be_difficult)
 	if (req->rq_status != -ENOSPC && req->rq_status != -EACCES &&
 	    req->rq_status != -EPERM && req->rq_status != -ENOENT &&
-	    req->rq_status != -EINPROGRESS && req->rq_status != -EDQUOT)
+	    req->rq_status != -EINPROGRESS && req->rq_status != -EDQUOT &&
+	    req->rq_status != -EROFS)
 		req->rq_type = PTL_RPC_MSG_ERR;
 
 	rc = ptlrpc_send_reply(req, may_be_difficult);

From patchwork Thu Aug 4 01:38:02 2022
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Etienne AUJAMES, Lustre Development List
Date: Wed, 3 Aug 2022 21:38:02 -0400
Message-Id: <1659577097-19253-18-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 17/32] lustre: llog: Add LLOG_SKIP_PLAIN to skip llog plain

From: Etienne AUJAMES

Add the catalog callback return value LLOG_SKIP_PLAIN to conditionally
skip an entire plain llog.
This can speed up catalog processing for specific usages where a record
needs to be accessed in the "middle" of the catalog. This can be useful
for changelogs with several users, or for HSM.

This patch modifies chlg_read_cat_process_cb() to use LLOG_SKIP_PLAIN.

The main idea came from:
d813c75d ("LU-14688 mdt: changelog purge deletes plain llog")

Performance test:

* Environment: 2474195 changelog records stored on mds0 (40 plain llogs):
  mds# lctl get_param -n mdd.lustrefs-MDT0000.changelog_users
  current index: 2474195
  ID    index (idle seconds)
  cl1   0 (3509)
* Test: access a record at the end of the catalog (offset 2474194):
  client# time lfs changelog lustrefs-MDT0000 2474194 >/dev/null
* Results:
  - with the patch:    real 0m0.592s
  - without the patch: real 0m17.835s (30x slower)

WC-bug-id: https://jira.whamcloud.com/browse/LU-15481
Lustre-commit: aa22a6826ee521ab1 ("LU-15481 llog: Add LLOG_SKIP_PLAIN to skip llog plain")
Signed-off-by: Etienne AUJAMES
Reviewed-on: https://review.whamcloud.com/46310
Reviewed-by: Alexander Boyko
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/include/lustre_log.h | 18 +++++++++++++++++-
 fs/lustre/mdc/mdc_changelog.c  |  5 +++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/include/lustre_log.h b/fs/lustre/include/lustre_log.h
index 2e43d56..dbf3fd6 100644
--- a/fs/lustre/include/lustre_log.h
+++ b/fs/lustre/include/lustre_log.h
@@ -264,7 +264,7 @@ struct llog_ctxt {
 };
 
 #define LLOG_PROC_BREAK	0x0001
-#define LLOG_DEL_RECORD	0x0002
+#define LLOG_SKIP_PLAIN	0x0004
 
 static inline int llog_handle2ops(struct llog_handle *loghandle,
 				  const struct llog_operations **lop)
@@ -375,6 +375,22 @@ static inline int llog_next_block(const struct lu_env *env,
 	return rc;
 }
 
+/* Determine if a plain llog of a catalog can be skipped based on record
+ * custom indexes.
+ * This assumes that indexes follow each other. The number of records to
+ * skip can be computed based on a starting offset and the index of the
+ * current record (in the llog catalog callback).
+ */
+static inline int llog_is_plain_skipable(struct llog_log_hdr *lh,
+					 struct llog_rec_hdr *rec,
+					 u64 curr, u64 start)
+{
+	if (start == 0 || curr >= start)
+		return 0;
+
+	return (LLOG_HDR_BITMAP_SIZE(lh) - rec->lrh_index) < (start - curr);
+}
+
 /* llog.c */
 int lustre_process_log(struct super_block *sb, char *logname,
 		       struct config_llog_instance *cfg);
diff --git a/fs/lustre/mdc/mdc_changelog.c b/fs/lustre/mdc/mdc_changelog.c
index 36d7fdd..cd2a610 100644
--- a/fs/lustre/mdc/mdc_changelog.c
+++ b/fs/lustre/mdc/mdc_changelog.c
@@ -225,6 +225,11 @@ static int chlg_read_cat_process_cb(const struct lu_env *env,
 		return rc;
 	}
 
+	/* Check if we can skip the entire plain llog */
+	if (llog_is_plain_skipable(llh->lgh_hdr, hdr, rec->cr.cr_index,
+				   crs->crs_start_offset))
+		return LLOG_SKIP_PLAIN;
+
 	/* Skip undesired records */
 	if (rec->cr.cr_index < crs->crs_start_offset)
 		return 0;

From patchwork Thu Aug 4 01:38:03 2022
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 3 Aug 2022 21:38:03 -0400
Message-Id: <1659577097-19253-19-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 18/32] lustre: llite: add projid to debug logs

From: Andreas Dilger

Add some minimal debugging on the client to log the projid when it is
changed, along with the affected FID.
WC-bug-id: https://jira.whamcloud.com/browse/LU-13335
Lustre-commit: 6bceb0030d15b7009 ("LU-13335 ldiskfs: add projid to debug logs")
Signed-off-by: Andreas Dilger
Reviewed-on: https://review.whamcloud.com/46369
Reviewed-by: Arshad Hussain
Reviewed-by: Li Dongyang
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/llite/file.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 3aace07..ac20d05 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -3417,6 +3417,8 @@ static int ll_set_project(struct inode *inode, u32 xflags, u32 projid)
 	unsigned int inode_flags;
 	int rc = 0;
 
+	CDEBUG(D_QUOTA, DFID" xflags=%x projid=%u\n",
+	       PFID(ll_inode2fid(inode)), xflags, projid);
 	rc = ll_ioctl_check_project(inode, xflags, projid);
 	if (rc)
 		return rc;

From patchwork Thu Aug 4 01:38:04 2022
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 3 Aug 2022 21:38:04 -0400
Message-Id: <1659577097-19253-20-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 19/32] lnet: asym route inconsistency warning

From: Gian-Carlo DeFazio

lnet_check_route_inconsistency() checks for inconsistency between the
lr_hops and lr_single_hop values of a route. A warning is currently
emitted if the route is not single hop and the hop count is either 1
or LNET_UNDEFINED_HOPS.

Add the requirement that avoid_asym_router_failure is enabled before
the warning is emitted.
WC-bug-id: https://jira.whamcloud.com/browse/LU-14555
Lustre-commit: 6ab060e58e6b3f38b ("LU-14555 lnet: asym route inconsistency warning")
Signed-off-by: Gian-Carlo DeFazio
Reviewed-on: https://review.whamcloud.com/46918
Reviewed-by: Olaf Faaland-LLNL
Reviewed-by: Serguei Smirnov
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 net/lnet/lnet/router.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c
index bbef2b3..b684243 100644
--- a/net/lnet/lnet/router.c
+++ b/net/lnet/lnet/router.c
@@ -369,7 +369,8 @@ bool lnet_is_route_alive(struct lnet_route *route)
 lnet_check_route_inconsistency(struct lnet_route *route)
 {
 	if (!route->lr_single_hop &&
-	    (route->lr_hops == 1 || route->lr_hops == LNET_UNDEFINED_HOPS)) {
+	    (route->lr_hops == 1 || route->lr_hops == LNET_UNDEFINED_HOPS) &&
+	    avoid_asym_router_failure) {
 		CWARN("route %s->%s is detected to be multi-hop but hop count is set to %d\n",
 		      libcfs_net2str(route->lr_net),
 		      libcfs_nidstr(&route->lr_gateway->lp_primary_nid),

From patchwork Thu Aug 4 01:38:05 2022
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 3 Aug 2022 21:38:05 -0400
Message-Id: <1659577097-19253-21-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 20/32] lnet: libcfs: debugfs file_operation should have an owner

From: Mr NeilBrown

If a debugfs file is open when the libcfs/lnet module is unloaded, it
produces a kernel Oops (the debugfs file_operations callbacks no
longer exist). Crash generated with routerstat
(/sys/kernel/debug/lnet/stats):

[ 1449.750396] IP: [] SyS_lseek+0x83/0x100
[ 1449.750412] PGD 9fa14067 PUD 9fa16067 PMD d4e5d067 PTE 0
[ 1449.750428] Oops: 0000 [#1] SMP
[ 1449.750883] [] system_call_fastpath+0x25/0x2a
[ 1449.750897] [] ? system_call_after_swapgs+0xa2/0x13a

This patch adds an owner to the debugfs file_operations for the libcfs
and lnet_router entries (/sys/kernel/debug/lnet/*).
The following behavior is expected:

  $ modprobe lustre
  $ routerstat 10 > /dev/null &
  $ lustre_rmmod
  rmmod: ERROR: Module lnet is in use
  Can't read statfile (ENODEV)
  [1]+  Exit 1    routerstat 10 > /dev/null
  $ lustre_rmmod

Note that the allocated 'struct file_operations' cannot be freed until
the module_exit() function is called, as files could still be open
until then.

WC-bug-id: https://jira.whamcloud.com/browse/LU-15759
Lustre-commit: b2dfb4457f0f1e56f ("LU-15759 libcfs: debugfs file_operation should have an owner")
Signed-off-by: Mr NeilBrown
Reviewed-on: https://review.whamcloud.com/47335
Reviewed-by: Etienne AUJAMES
Reviewed-by: James Simmons
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 include/linux/libcfs/libcfs.h |  4 +++-
 include/linux/lnet/lib-lnet.h |  1 +
 net/lnet/libcfs/module.c      | 43 ++++++++++++++++++++++++++++++++++++-------
 net/lnet/lnet/module.c        |  1 +
 net/lnet/lnet/router_proc.c   | 10 +++++++++-
 5 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/include/linux/libcfs/libcfs.h b/include/linux/libcfs/libcfs.h
index b59ef9b..e29b007 100644
--- a/include/linux/libcfs/libcfs.h
+++ b/include/linux/libcfs/libcfs.h
@@ -57,8 +57,10 @@ static inline int notifier_from_ioctl_errno(int err)
 
 extern struct workqueue_struct *cfs_rehash_wq;
 
-void lnet_insert_debugfs(struct ctl_table *table);
+void lnet_insert_debugfs(struct ctl_table *table, struct module *mod,
+			 void **statep);
 void lnet_remove_debugfs(struct ctl_table *table);
+void lnet_debugfs_fini(void **statep);
 
 int debugfs_doint(struct ctl_table *table, int write,
 		  void __user *buffer, size_t *lenp, loff_t *ppos);
diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index 5a83190..57c8dc2 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -569,6 +569,7 @@ unsigned int lnet_nid_cpt_hash(struct lnet_nid *nid,
 
 int lnet_lib_init(void);
 void lnet_lib_exit(void);
+void lnet_router_exit(void);
 
 void lnet_mt_event_handler(struct lnet_event *event);
diff --git a/net/lnet/libcfs/module.c b/net/lnet/libcfs/module.c
index a249bdd..a126683 100644
--- a/net/lnet/libcfs/module.c
+++ b/net/lnet/libcfs/module.c
@@ -746,19 +746,23 @@ static ssize_t lnet_debugfs_write(struct file *filp, const char __user *buf,
 	.llseek = default_llseek,
 };
 
-static const struct file_operations *lnet_debugfs_fops_select(umode_t mode)
+static const struct file_operations *
+lnet_debugfs_fops_select(umode_t mode, const struct file_operations state[3])
 {
 	if (!(mode & 0222))
-		return &lnet_debugfs_file_operations_ro;
+		return &state[0];
 
 	if (!(mode & 0444))
-		return &lnet_debugfs_file_operations_wo;
+		return &state[1];
 
-	return &lnet_debugfs_file_operations_rw;
+	return &state[2];
 }
 
-void lnet_insert_debugfs(struct ctl_table *table)
+void lnet_insert_debugfs(struct ctl_table *table, struct module *mod,
+			 void **statep)
 {
+	struct file_operations *state = *statep;
+
 	if (!lnet_debugfs_root)
 		lnet_debugfs_root = debugfs_create_dir("lnet", NULL);
 
@@ -766,6 +770,19 @@ void lnet_insert_debugfs(struct ctl_table *table)
 	if (IS_ERR_OR_NULL(lnet_debugfs_root))
 		return;
 
+	if (!state) {
+		state = kmalloc(3 * sizeof(*state), GFP_KERNEL);
+		if (!state)
+			return;
+		state[0] = lnet_debugfs_file_operations_ro;
+		state[0].owner = mod;
+		state[1] = lnet_debugfs_file_operations_wo;
+		state[1].owner = mod;
+		state[2] = lnet_debugfs_file_operations_rw;
+		state[2].owner = mod;
+		*statep = state;
+	}
+
 	/*
 	 * We don't save the dentry returned because we don't call
 	 * debugfs_remove() but rather remove_recursive()
@@ -773,10 +790,18 @@ void lnet_insert_debugfs(struct ctl_table *table)
 	for (; table && table->procname; table++)
 		debugfs_create_file(table->procname, table->mode,
 				    lnet_debugfs_root, table,
-				    lnet_debugfs_fops_select(table->mode));
+				    lnet_debugfs_fops_select(table->mode,
+					(const struct file_operations *)state));
 }
 EXPORT_SYMBOL_GPL(lnet_insert_debugfs);
 
+void lnet_debugfs_fini(void **state)
+{
+	kfree(*state);
+	*state = NULL;
+}
+EXPORT_SYMBOL_GPL(lnet_debugfs_fini);
+
 static void lnet_insert_debugfs_links(
 	const struct lnet_debugfs_symlink_def *symlinks)
 {
@@ -801,6 +826,8 @@ void lnet_remove_debugfs(struct ctl_table *table)
 static DEFINE_MUTEX(libcfs_startup);
 static int libcfs_active;
 
+static void *debugfs_state;
+
 int libcfs_setup(void)
 {
 	int rc = -EINVAL;
@@ -855,7 +882,7 @@ static int libcfs_init(void)
 {
 	int rc;
 
-	lnet_insert_debugfs(lnet_table);
+	lnet_insert_debugfs(lnet_table, THIS_MODULE, &debugfs_state);
 	if (!IS_ERR_OR_NULL(lnet_debugfs_root))
 		lnet_insert_debugfs_links(lnet_debugfs_symlinks);
 
@@ -875,6 +902,8 @@ static void libcfs_exit(void)
 	debugfs_remove_recursive(lnet_debugfs_root);
 	lnet_debugfs_root = NULL;
 
+	lnet_debugfs_fini(&debugfs_state);
+
 	if (cfs_rehash_wq)
 		destroy_workqueue(cfs_rehash_wq);
diff --git a/net/lnet/lnet/module.c b/net/lnet/lnet/module.c
index aba9589..9d7b39a 100644
--- a/net/lnet/lnet/module.c
+++ b/net/lnet/lnet/module.c
@@ -271,6 +271,7 @@ static void __exit lnet_exit(void)
 					 &lnet_ioctl_handler);
 	LASSERT(!rc);
 
+	lnet_router_exit();
 	lnet_lib_exit();
 }
diff --git a/net/lnet/lnet/router_proc.c b/net/lnet/lnet/router_proc.c
index f231da1..689670a 100644
--- a/net/lnet/lnet/router_proc.c
+++ b/net/lnet/lnet/router_proc.c
@@ -888,12 +888,20 @@ static int proc_lnet_portal_rotor(struct ctl_table *table, int write,
 	}
 };
 
+static void *debugfs_state;
+
 void lnet_router_debugfs_init(void)
 {
-	lnet_insert_debugfs(lnet_table);
+	lnet_insert_debugfs(lnet_table, THIS_MODULE,
+			    &debugfs_state);
 }
 
 void lnet_router_debugfs_fini(void)
 {
 	lnet_remove_debugfs(lnet_table);
 }
+
+void lnet_router_exit(void)
+{
+	lnet_debugfs_fini(&debugfs_state);
+}

From patchwork Thu Aug 4 01:38:06 2022
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Mikhail Pershin, Lustre Development List
Date: Wed, 3 Aug 2022 21:38:06 -0400
Message-Id: <1659577097-19253-22-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 21/32] lustre: client: able to cleanup devices manually

From: Mikhail Pershin

Using 'lctl cleanup/detach' can be needed in situations with an
unclean umount.
Meanwhile, that does not currently work for LMV and can even cause a
panic. This patch restores the ability to cleanup/detach client
devices manually:

- the debugfs and lprocfs cleanup in lmv_precleanup() is moved to
  lmv_cleanup() so it is not cleared too early. This prevents a hang
  on 'lctl cleanup' for the LMV device
- test 172 is added to sanity. It skips device cleanup during normal
  umount, keeping devices alive with no client mount, then manually
  cleans up/detaches them
- prevent negative lov_connections in lov_disconnect() and handle it
  gracefully
- remove obd_cleanup_client_import() from mdc_precleanup(); it is
  already called inside osc_precleanup_common()

WC-bug-id: https://jira.whamcloud.com/browse/LU-15653
Lustre-commit: 210803a2475862464 ("LU-15653 client: able to cleanup devices manually")
Signed-off-by: Mikhail Pershin
Reviewed-on: https://review.whamcloud.com/46859
Reviewed-by: John L. Hammond
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/include/obd_support.h |  1 +
 fs/lustre/llite/llite_lib.c     |  5 +++++
 fs/lustre/lmv/lmv_obd.c         | 14 ++++++--------
 fs/lustre/lov/lov_obd.c         |  8 +++++++-
 fs/lustre/mdc/mdc_request.c     |  7 +++----
 5 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index b6c8a72..e25d4ed 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -381,6 +381,7 @@
 #define OBD_FAIL_OBDCLASS_MODULE_LOAD	0x60a
 #define OBD_FAIL_OBD_ZERO_NLINK_RACE	0x60b
 #define OBD_FAIL_OBD_SETUP		0x60d
+#define OBD_FAIL_OBD_CLEANUP		0x60e
 
 #define OBD_FAIL_TGT_REPLY_NET		0x700
 #define OBD_FAIL_TGT_CONN_RACE		0x701
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index 5b80722..d947ede 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -1377,10 +1377,15 @@ void ll_put_super(struct super_block *sb)
 		client_common_put_super(sb);
 	}
 
+	/* imitate failed cleanup */
+	if (OBD_FAIL_CHECK(OBD_FAIL_OBD_CLEANUP))
+		goto skip_cleanup;
+
 	next = 0;
 	while ((obd = class_devices_in_group(&sbi->ll_sb_uuid, &next)))
 		class_manual_cleanup(obd);
 
+skip_cleanup:
 	if (test_bit(LL_SBI_VERBOSE, sbi->ll_flags))
 		LCONSOLE_WARN("Unmounted %s\n", profilenm ? profilenm : "");
 
diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c
index 3af7a53..8656d6b 100644
--- a/fs/lustre/lmv/lmv_obd.c
+++ b/fs/lustre/lmv/lmv_obd.c
@@ -520,7 +520,6 @@ static int lmv_disconnect(struct obd_export *exp)
 	struct obd_device *obd = class_exp2obd(exp);
 	struct lmv_obd *lmv = &obd->u.lmv;
 	struct lmv_tgt_desc *tgt;
-	int rc;
 
 	lmv_foreach_connected_tgt(lmv, tgt)
 		lmv_disconnect_mdc(obd, tgt);
@@ -528,11 +527,8 @@ static int lmv_disconnect(struct obd_export *exp)
 	if (lmv->lmv_tgts_kobj)
 		kobject_put(lmv->lmv_tgts_kobj);
 
-	if (!lmv->connected)
-		class_export_put(exp);
-	rc = class_disconnect(exp);
 	lmv->connected = 0;
-	return rc;
+	return class_disconnect(exp);
 }
 
 static int lmv_fid2path(struct obd_export *exp, int len, void *karg,
@@ -1147,6 +1143,11 @@ static int lmv_cleanup(struct obd_device *obd)
 	struct lu_tgt_desc *tmp;
 
 	fld_client_fini(&lmv->lmv_fld);
+	fld_client_debugfs_fini(&lmv->lmv_fld);
+
+	lprocfs_obd_cleanup(obd);
+	ldebugfs_free_md_stats(obd);
+
 	lmv_foreach_tgt_safe(lmv, tgt, tmp)
 		lmv_del_target(lmv, tgt);
 	lu_tgt_descs_fini(&lmv->lmv_mdt_descs);
@@ -3063,9 +3064,6 @@ static int lmv_unlink(struct obd_export *exp, struct md_op_data *op_data,
 static int lmv_precleanup(struct obd_device *obd)
 {
 	libcfs_kkuc_group_rem(&obd->obd_uuid, 0, KUC_GRP_HSM);
-	fld_client_debugfs_fini(&obd->u.lmv.lmv_fld);
-	lprocfs_obd_cleanup(obd);
-	ldebugfs_free_md_stats(obd);
 	return 0;
 }
 
diff --git a/fs/lustre/lov/lov_obd.c b/fs/lustre/lov/lov_obd.c
index 61159fd..161226f 100644
--- a/fs/lustre/lov/lov_obd.c
+++ b/fs/lustre/lov/lov_obd.c
@@ -313,8 +313,14 @@ static int lov_disconnect(struct obd_export *exp)
 		goto out;
 
 	/* Only disconnect the underlying layers on the final disconnect. */
+	if (lov->lov_connects == 0) {
+		CWARN("%s: was disconnected already #%d\n",
+		      obd->obd_name, lov->lov_connects);
+		return 0;
+	}
+
 	lov->lov_connects--;
-	if (lov->lov_connects != 0) {
+	if (lov->lov_connects > 0) {
 		/* why should there be more than 1 connect? */
 		CWARN("%s: unexpected disconnect #%d\n",
 		      obd->obd_name, lov->lov_connects);
diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c
index bb51878..c073da2 100644
--- a/fs/lustre/mdc/mdc_request.c
+++ b/fs/lustre/mdc/mdc_request.c
@@ -2959,11 +2959,10 @@ static int mdc_precleanup(struct obd_device *obd)
 	osc_precleanup_common(obd);
 	mdc_changelog_cdev_finish(obd);
-
-	obd_cleanup_client_import(obd);
-	ptlrpc_lprocfs_unregister_obd(obd);
-	ldebugfs_free_md_stats(obd);
 	mdc_llog_finish(obd);
+	ldebugfs_free_md_stats(obd);
+	ptlrpc_lprocfs_unregister_obd(obd);
+
 	return 0;
 }

From patchwork Thu Aug 4 01:38:07 2022
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lai Siyao, Lustre Development List
Date: Wed, 3 Aug 2022 21:38:07 -0400
Message-Id: <1659577097-19253-23-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 22/32] lustre: lmv: support striped LMVs

From: Lai Siyao

lmv_name_to_stripe_index() should support striped LMVs; this is used
by LFSCK to verify name hashes.
WC-bug-id: https://jira.whamcloud.com/browse/LU-15868
Lustre-commit: 54a2d4662b58e2ba4 ("LU-15868 lfsck: don't crash upon dir migration failure")
Signed-off-by: Lai Siyao
Reviewed-on: https://review.whamcloud.com/47381
Reviewed-by: Andreas Dilger
Reviewed-by: Hongchao Zhang
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/include/lustre_lmv.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/lustre/include/lustre_lmv.h b/fs/lustre/include/lustre_lmv.h
index b1d8ed9..cd7cf9e 100644
--- a/fs/lustre/include/lustre_lmv.h
+++ b/fs/lustre/include/lustre_lmv.h
@@ -366,14 +366,16 @@ static inline u32 crush_hash(u32 a, u32 b)
 static inline int lmv_name_to_stripe_index(struct lmv_mds_md_v1 *lmv,
 					   const char *name, int namelen)
 {
-	if (lmv->lmv_magic == LMV_MAGIC_V1)
+	if (lmv->lmv_magic == LMV_MAGIC_V1 ||
+	    lmv->lmv_magic == LMV_MAGIC_STRIPE)
 		return __lmv_name_to_stripe_index(lmv->lmv_hash_type,
 						  lmv->lmv_stripe_count,
 						  lmv->lmv_migrate_hash,
 						  lmv->lmv_migrate_offset,
 						  name, namelen, true);

-	if (lmv->lmv_magic == cpu_to_le32(LMV_MAGIC_V1))
+	if (lmv->lmv_magic == cpu_to_le32(LMV_MAGIC_V1) ||
+	    lmv->lmv_magic == cpu_to_le32(LMV_MAGIC_STRIPE))
 		return __lmv_name_to_stripe_index(
 				le32_to_cpu(lmv->lmv_hash_type),
 				le32_to_cpu(lmv->lmv_stripe_count),

From patchwork Thu Aug 4 01:38:08 2022
From: James Simmons
Date: Wed, 3 Aug 2022 21:38:08 -0400
Subject: [lustre-devel] [PATCH 23/32] lnet: o2iblnd: add debug messages for IB

From: Cyril Bordage

If net debug is enabled, information about the connection is collected
when the tx status is ECONNABORTED (only for IB).
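The failure path this patch instruments reduces to two predicates: record only the first failure of a tx, and emit the extra diagnostic only when net debugging is enabled and the status is -ECONNABORTED. A minimal sketch follows; the demo names and the D_NET bit value are assumptions, only ECONNABORTED = 103 matches the Linux errno.

```c
#include <assert.h>

#define DEMO_D_NET	  0x00000200u	/* stand-in for the libcfs D_NET bit */
#define DEMO_ECONNABORTED 103		/* Linux errno: connection aborted */

/* First failure wins: mirrors the "success so far ... failed?" test
 * that the new CDEBUG() sits inside. */
static int demo_first_failure(int tx_status, int status)
{
	if (tx_status == 0 && status < 0)
		return status;
	return tx_status;
}

/* The new diagnostic fires only with net debugging enabled and only
 * for an aborted connection. */
static int demo_should_log_conn(unsigned int debug_mask, int status)
{
	return (debug_mask & DEMO_D_NET) != 0 && status == -DEMO_ECONNABORTED;
}
```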
WC-bug-id: https://jira.whamcloud.com/browse/LU-15925
Lustre-commit: 9153049bdc7ec8217 ("LU-15925 lnet: add debug messages for IB")
Signed-off-by: Cyril Bordage
Reviewed-on: https://review.whamcloud.com/47583
Reviewed-by: Frank Sehr
Reviewed-by: Serguei Smirnov
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 01fa499..d4d8954 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -276,6 +276,13 @@ static int kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type,
 	if (!tx->tx_status) {		/* success so far */
 		if (status < 0) {	/* failed? */
+			if (status == -ECONNABORTED) {
+				CDEBUG(D_NET,
+				       "bad status for connection to %s with completion type %x\n",
+				       libcfs_nid2str(conn->ibc_peer->ibp_nid),
+				       txtype);
+			}
+
 			tx->tx_status = status;
 			tx->tx_hstatus = LNET_MSG_STATUS_REMOTE_ERROR;
 		} else if (txtype == IBLND_MSG_GET_REQ) {
@@ -812,6 +819,8 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx,
 	/* I'm still holding ibc_lock! */
 	if (conn->ibc_state != IBLND_CONN_ESTABLISHED) {
+		CDEBUG(D_NET, "connection to %s is not established\n",
+		       conn->ibc_peer ? libcfs_nid2str(conn->ibc_peer->ibp_nid) : "NULL");
 		rc = -ECONNABORTED;
 	} else if (tx->tx_pool->tpo_pool.po_failed ||
 		   conn->ibc_hdev != tx->tx_pool->tpo_hdev) {
@@ -1153,6 +1162,9 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx,
 	LASSERT(conn->ibc_state >= IBLND_CONN_ESTABLISHED);

 	if (conn->ibc_state >= IBLND_CONN_DISCONNECTED) {
+		CDEBUG(D_NET, "connection with %s is disconnected\n",
+		       conn->ibc_peer ? libcfs_nid2str(conn->ibc_peer->ibp_nid) : "NULL");
+
 		tx->tx_status = -ECONNABORTED;
 		tx->tx_waiting = 0;

 		if (tx->tx_conn) {
@@ -2141,10 +2153,12 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx,
 	kiblnd_set_conn_state(conn, IBLND_CONN_DISCONNECTED);

-	/*
-	 * Complete all tx descs not waiting for sends to complete.
+	/* Complete all tx descs not waiting for sends to complete.
 	 * NB we should be safe from RDMA now that the QP has changed state
 	 */
+	CDEBUG(D_NET, "abort connection with %s\n",
+	       libcfs_nid2str(conn->ibc_peer->ibp_nid));
+
 	kiblnd_abort_txs(conn, &conn->ibc_tx_noops);
 	kiblnd_abort_txs(conn, &conn->ibc_tx_queue);
 	kiblnd_abort_txs(conn, &conn->ibc_tx_queue_rsrvd);

From patchwork Thu Aug 4 01:38:09 2022
From: James Simmons
Date: Wed, 3 Aug 2022 21:38:09 -0400
Subject: [lustre-devel] [PATCH 24/32] lnet: o2iblnd: debug message is missing a newline

From: Serguei Smirnov

Add missing newline to one of the debug messages in
kiblnd_pool_alloc_node.

WC-bug-id: https://jira.whamcloud.com/browse/LU-15984
Lustre-commit: dd670d968a44f0a70 ("LU-15984 o2iblnd: debug message is missing a newline")
Signed-off-by: Serguei Smirnov
Reviewed-on: https://review.whamcloud.com/47933
Reviewed-by: Frank Sehr
Reviewed-by: Cyril Bordage
Reviewed-by: James Simmons
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 net/lnet/klnds/o2iblnd/o2iblnd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c
index 65bc89b..ea28c65 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd.c
@@ -1887,7 +1887,7 @@ struct list_head *kiblnd_pool_alloc_node(struct kib_poolset *ps)
 	CDEBUG(D_NET, "%s pool exhausted, allocate new pool\n", ps->ps_name);
 	time_before = ktime_get();
 	rc = ps->ps_pool_create(ps, ps->ps_pool_size, &pool);
-	CDEBUG(D_NET, "ps_pool_create took %lld ms to complete",
+	CDEBUG(D_NET, "ps_pool_create took %lld ms to complete\n",
 	       ktime_ms_delta(ktime_get(), time_before));

 	spin_lock_bh(&ps->ps_lock);

From patchwork Thu Aug 4 01:38:10 2022
From: James Simmons
Date: Wed, 3 Aug 2022 21:38:10 -0400
Subject: [lustre-devel] [PATCH 25/32] lustre: quota: skip non-exist or inact tgt for lfs_quota
From: Hongchao Zhang

The nonexistent or inactive targets (MDC or OSC) should be skipped for
quota reporting.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14472
Lustre-commit: b54b7ce43929ce7ff ("LU-14472 quota: skip non-exist or inact tgt for lfs_quota")
Signed-off-by: Hongchao Zhang
Reviewed-on: https://review.whamcloud.com/41771
Reviewed-by: Andreas Dilger
Reviewed-by: Feng, Lei
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/llite/dir.c   |  8 +++-----
 fs/lustre/lmv/lmv_obd.c | 15 +++++++++++++--
 fs/lustre/lov/lov_obd.c | 13 ++++++++++++-
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index 2b63c48..26c9ec3 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -1215,10 +1215,9 @@ int quotactl_ioctl(struct super_block *sb, struct if_quotactl *qctl)
 			break;
 		}

+		qctl->qc_cmd = cmd;
 		if (rc)
 			return rc;
-
-		qctl->qc_cmd = cmd;
 	} else {
 		struct obd_quotactl *oqctl;
 		int oqctl_len = sizeof(*oqctl);
@@ -2009,10 +2008,9 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		}

 		rc = quotactl_ioctl(inode->i_sb, qctl);
-		if (rc == 0 && copy_to_user((void __user *)arg, qctl,
-					    sizeof(*qctl)))
+		if ((rc == 0 || rc == -ENODATA) &&
+		    copy_to_user((void __user *)arg, qctl, sizeof(*qctl)))
 			rc = -EFAULT;
-
 out_quotactl:
 		kfree(qctl);
 		return rc;
diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c
index 8656d6b..6c0eb03 100644
--- a/fs/lustre/lmv/lmv_obd.c
+++ b/fs/lustre/lmv/lmv_obd.c
@@ -845,6 +845,7 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp,
 	case OBD_IOC_QUOTACTL: {
 		struct if_quotactl *qctl = karg;
 		struct obd_quotactl *oqctl;
+		struct obd_import *imp;

 		if (qctl->qc_valid == QC_MDTIDX) {
 			tgt = lmv_tgt(lmv, qctl->qc_idx);
@@ -863,9 +864,19 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp,
 			return -EINVAL;
 		}

-		if (!tgt || !tgt->ltd_exp)
+		if (!tgt)
+			return -ENODEV;
+
+		if (!tgt->ltd_exp)
 			return -EINVAL;

+		imp = class_exp2cliimp(tgt->ltd_exp);
+		if (!tgt->ltd_active && imp->imp_state != LUSTRE_IMP_IDLE) {
+			qctl->qc_valid = QC_MDTIDX;
+			qctl->obd_uuid = tgt->ltd_uuid;
+			return -ENODATA;
+		}
+
 		oqctl = kzalloc(sizeof(*oqctl), GFP_KERNEL);
 		if (!oqctl)
 			return -ENOMEM;
@@ -3122,7 +3133,7 @@ static int lmv_get_info(const struct lu_env *env, struct obd_export *exp,
 		exp->exp_connect_data = *(struct obd_connect_data *)val;
 		return rc;
 	} else if (KEY_IS(KEY_TGT_COUNT)) {
-		*((int *)val) = lmv->lmv_mdt_descs.ltd_lmv_desc.ld_tgt_count;
+		*((int *)val) = lmv->lmv_mdt_descs.ltd_tgts_size;
 		return 0;
 	}
diff --git a/fs/lustre/lov/lov_obd.c b/fs/lustre/lov/lov_obd.c
index 161226f..d2fe8c3 100644
--- a/fs/lustre/lov/lov_obd.c
+++ b/fs/lustre/lov/lov_obd.c
@@ -1021,13 +1021,17 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		struct if_quotactl *qctl = karg;
 		struct lov_tgt_desc *tgt = NULL;
 		struct obd_quotactl *oqctl;
+		struct obd_import *imp;

 		if (qctl->qc_valid == QC_OSTIDX) {
 			if (count <= qctl->qc_idx)
 				return -EINVAL;

 			tgt = lov->lov_tgts[qctl->qc_idx];
-			if (!tgt || !tgt->ltd_exp)
+			if (!tgt)
+				return -ENODEV;
+
+			if (!tgt->ltd_exp)
 				return -EINVAL;
 		} else if (qctl->qc_valid == QC_UUID) {
 			for (i = 0; i < count; i++) {
@@ -1050,6 +1054,13 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 				return -EAGAIN;

 		LASSERT(tgt && tgt->ltd_exp);
+		imp = class_exp2cliimp(tgt->ltd_exp);
+		if (!tgt->ltd_active && imp->imp_state != LUSTRE_IMP_IDLE) {
+			qctl->qc_valid = QC_OSTIDX;
+			qctl->obd_uuid = tgt->ltd_uuid;
+			return -ENODATA;
+		}
+
 		oqctl = kzalloc(sizeof(*oqctl), GFP_NOFS);
 		if (!oqctl)
 			return -ENOMEM;

From patchwork Thu Aug 4 01:38:11 2022
From: James Simmons
Date: Wed, 3 Aug 2022 21:38:11 -0400
Subject: [lustre-devel] [PATCH 26/32] lustre: mdc: pack default LMV in open reply
From: Lai Siyao

Add flag MDS_OPEN_DEFAULT_LMV to indicate that default LMV should be
packed in open reply, otherwise if open fetches LOOKUP lock, client
won't know directory has default LMV, and in subdir creation default
LMV won't take effect.

WC-bug-id: https://jira.whamcloud.com/browse/LU-15850
Lustre-commit: f6e4272fb0be5b798 ("LU-15850 mdt: pack default LMV in open reply")
Signed-off-by: Lai Siyao
Reviewed-on: https://review.whamcloud.com/47576
Reviewed-by: Andreas Dilger
Reviewed-by: Hongchao Zhang
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/mdc/mdc_lib.c                 | 1 +
 fs/lustre/mdc/mdc_locks.c               | 2 ++
 fs/lustre/ptlrpc/layout.c               | 1 +
 fs/lustre/ptlrpc/wiretest.c             | 2 ++
 include/uapi/linux/lustre/lustre_user.h | 5 ++++-
 5 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/mdc/mdc_lib.c b/fs/lustre/mdc/mdc_lib.c
index 51080a1..077639d 100644
--- a/fs/lustre/mdc/mdc_lib.c
+++ b/fs/lustre/mdc/mdc_lib.c
@@ -329,6 +329,7 @@ void mdc_open_pack(struct req_capsule *pill, struct md_op_data *op_data,
 			rec->cr_archive_id = op_data->op_archive_id;
 		}
 	}
+	cr_flags |= MDS_OPEN_DEFAULT_LMV;
 	set_mrc_cr_flags(rec, cr_flags);
 }
diff --git a/fs/lustre/mdc/mdc_locks.c b/fs/lustre/mdc/mdc_locks.c
index b86d1b9..2a9b9a8 100644
--- a/fs/lustre/mdc/mdc_locks.c
+++ b/fs/lustre/mdc/mdc_locks.c
@@ -393,6 +393,8 @@ static int mdc_save_lovea(struct ptlrpc_request *req, void *data, u32 size)
 	 */
 	req_capsule_set_size(&req->rq_pill, &RMF_NIOBUF_INLINE, RCL_SERVER,
 			     sizeof(struct niobuf_remote));
+	req_capsule_set_size(&req->rq_pill, &RMF_DEFAULT_MDT_MD, RCL_SERVER,
+			     sizeof(struct lmv_user_md));
 	ptlrpc_request_set_replen(req);

 	/* Get real repbuf allocated size as rounded up power of 2 */
diff --git a/fs/lustre/ptlrpc/layout.c b/fs/lustre/ptlrpc/layout.c
index 8725edd..82ec899 100644
--- a/fs/lustre/ptlrpc/layout.c
+++ b/fs/lustre/ptlrpc/layout.c
@@ -447,6 +447,7 @@
 	&RMF_NIOBUF_INLINE,
 	&RMF_FILE_SECCTX,
 	&RMF_FILE_ENCCTX,
+	&RMF_DEFAULT_MDT_MD,
 };

 static const struct req_msg_field *ldlm_intent_getattr_client[] = {
diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c
index 81e0485..60a7fd0 100644
--- a/fs/lustre/ptlrpc/wiretest.c
+++ b/fs/lustre/ptlrpc/wiretest.c
@@ -2326,6 +2326,8 @@ void lustre_assert_wire_constants(void)
 		 (long long)MDS_OPEN_RESYNC);
 	LASSERTF(MDS_OPEN_PCC == 00000000010000000000000ULL, "found 0%.22lloULL\n",
 		 (long long)MDS_OPEN_PCC);
+	LASSERTF(MDS_OPEN_DEFAULT_LMV == 00000000040000000000000ULL, "found 0%.22lloULL\n",
+		 (long long)MDS_OPEN_DEFAULT_LMV);
 	LASSERTF(LUSTRE_SYNC_FL == 0x00000008, "found 0x%.8x\n",
 		 LUSTRE_SYNC_FL);
 	LASSERTF(LUSTRE_IMMUTABLE_FL == 0x00000010, "found 0x%.8x\n",
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index c57929b..7b79604 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -1246,12 +1246,15 @@ enum la_valid {
 						      * for newly created file */
 #define MDS_OP_WITH_FID		020000000000000ULL /* operation carried out by FID */
+#define MDS_OPEN_DEFAULT_LMV	040000000000000ULL /* open fetches default LMV */

+/* lustre internal open flags, which should not be set from user space */
 #define MDS_OPEN_FL_INTERNAL (MDS_OPEN_HAS_EA | MDS_OPEN_HAS_OBJS |	\
 			      MDS_OPEN_OWNEROVERRIDE | MDS_OPEN_LOCK |	\
 			      MDS_OPEN_BY_FID | MDS_OPEN_LEASE |	\
 			      MDS_OPEN_RELEASE | MDS_OPEN_RESYNC |	\
-			      MDS_OPEN_PCC | MDS_OP_WITH_FID)
+			      MDS_OPEN_PCC | MDS_OP_WITH_FID |	\
+			      MDS_OPEN_DEFAULT_LMV)

 /********* Changelogs **********/

From patchwork Thu Aug 4 01:38:12 2022
From: James Simmons
Date: Wed, 3 Aug 2022 21:38:12 -0400
Subject: [lustre-devel] [PATCH 27/32] lnet: Define KFILND network type

From: Chris Horn

Define the KFILND network type.
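(As an aside, the reservation amounts to one more entry in the nidstrings lookup table that maps LND type numbers to the strings used in NIDs such as "8@kfi". The sketch below is a hypothetical miniature of that table: the struct layout and the tcp/o2ib type numbers are illustrative assumptions; only KFILND = 16 comes from this patch.)

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the libcfs_netstrfns[] table: LND type
 * number -> string used in NIDs like "8@kfi". */
struct demo_netstrfn {
	int		 nf_type;
	const char	*nf_name;
};

static const struct demo_netstrfn demo_netstrfns[] = {
	{  2, "tcp"  },		/* illustrative */
	{  5, "o2ib" },		/* illustrative */
	{ 16, "kfi"  },		/* reserved for KFILND by this patch */
};

static const char *demo_type2str(int type)
{
	size_t i;

	for (i = 0; i < sizeof(demo_netstrfns) / sizeof(demo_netstrfns[0]); i++)
		if (demo_netstrfns[i].nf_type == type)
			return demo_netstrfns[i].nf_name;

	return NULL;	/* unknown network type */
}
```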
This reserves the network type number for future implementation and
allows creation of kfi peers and adding routes to kfi peers.

HPE-bug-id: LUS-11060
WC-bug-id: https://jira.whamcloud.com/browse/LU-15983
Lustre-commit: 5fea36c952373c9a2 ("LU-15983 lnet: Define KFILND network type")
Signed-off-by: Chris Horn
Reviewed-on: https://review.whamcloud.com/47830
Reviewed-by: James Simmons
Reviewed-by: Cyril Bordage
Reviewed-by: Frank Sehr
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 include/uapi/linux/lnet/nidstr.h |  1 +
 net/lnet/lnet/nidstrings.c       | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/uapi/linux/lnet/nidstr.h b/include/uapi/linux/lnet/nidstr.h
index 80be2eb..d5829fe 100644
--- a/include/uapi/linux/lnet/nidstr.h
+++ b/include/uapi/linux/lnet/nidstr.h
@@ -55,6 +55,7 @@ enum {
 	GNILND		= 13,
 	GNIIPLND	= 14,
 	PTL4LND		= 15,
+	KFILND		= 16,

 	NUM_LNDS
 };
diff --git a/net/lnet/lnet/nidstrings.c b/net/lnet/lnet/nidstrings.c
index 3523b78..ac2aa97 100644
--- a/net/lnet/lnet/nidstrings.c
+++ b/net/lnet/lnet/nidstrings.c
@@ -758,6 +758,16 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 	  .nf_print_addrlist	= libcfs_num_addr_range_print,
 	  .nf_match_addr	= libcfs_num_match
 	},
+	{
+	  .nf_type		= KFILND,
+	  .nf_name		= "kfi",
+	  .nf_modname		= "kkfilnd",
+	  .nf_addr2str		= libcfs_decnum_addr2str,
+	  .nf_str2addr		= libcfs_num_str2addr,
+	  .nf_parse_addrlist	= libcfs_num_parse,
+	  .nf_print_addrlist	= libcfs_num_addr_range_print,
+	  .nf_match_addr	= libcfs_num_match
+	},
 };

 static const size_t libcfs_nnetstrfns = ARRAY_SIZE(libcfs_netstrfns);

From patchwork Thu Aug 4 01:38:13 2022
From: James Simmons
Date: Wed, 3 Aug 2022 21:38:13 -0400
Subject: [lustre-devel] [PATCH 28/32] lnet: Adjust niov checks for large MD

From: Chris Horn

An LNet user can allocate a large contiguous MD. That MD can have
> LNET_MAX_IOV pages which causes some LNDs to assert on either the niov
argument passed to lnd_recv() or the value stored in lnet_msg::msg_niov.
This is true even in cases where the actual transfer size is <= LNET_MTU
and will not exceed limits in the LNDs.

Adjust ksocklnd_send()/ksocklnd_recv() to assert on the return value of
lnet_extract_kiov(). Remove the assert on msg_niov (payload_niov) from
kiblnd_send(). kiblnd_setup_rd_kiov() will already fail if we exceed
ko2iblnd's available scatter gather entries.

HPE-bug-id: LUS-10878
Fixes: 05cd1717bb ("lnet: always put a page list into struct lnet_libmd")
WC-bug-id: https://jira.whamcloud.com/browse/LU-15851
Lustre-commit: 105193b4a147257a0 ("LU-15851 lnet: Adjust niov checks for large MD")
Signed-off-by: Chris Horn
Reviewed-on: https://review.whamcloud.com/47319
Reviewed-by: Shaun Tancheff
Reviewed-by: Serguei Smirnov
Reviewed-by: Alexey Lyashkov
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 1 -
 net/lnet/klnds/socklnd/socklnd_cb.c | 5 +++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
index d4d8954..30e77c0 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -1564,7 +1564,6 @@ static int kiblnd_map_tx(struct lnet_ni *ni, struct kib_tx *tx,
 	       payload_nob, payload_niov, libcfs_idstr(target));

 	LASSERT(!payload_nob || payload_niov > 0);
-	LASSERT(payload_niov <= LNET_MAX_IOV);

 	/* Thread context */
 	LASSERT(!in_interrupt());
diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c
index 94600f3..308d8b0 100644
--- a/net/lnet/klnds/socklnd/socklnd_cb.c
+++ b/net/lnet/klnds/socklnd/socklnd_cb.c
@@ -936,7 +936,6 @@ struct ksock_conn_cb *
 	       payload_nob, payload_niov, libcfs_idstr(target));

 	LASSERT(!payload_nob || payload_niov > 0);
-	LASSERT(payload_niov <= LNET_MAX_IOV);
 	LASSERT(!in_interrupt());

 	desc_size = offsetof(struct ksock_tx,
@@ -962,6 +961,8 @@ struct ksock_conn_cb *
 					  payload_niov, payload_kiov,
 					  payload_offset, payload_nob);

+	LASSERT(tx->tx_nkiov <= LNET_MAX_IOV);
+
 	if (payload_nob >= *ksocknal_tunables.ksnd_zc_min_payload)
 		tx->tx_zc_capable = 1;
@@ -1278,13 +1279,13 @@ struct ksock_conn_cb *
 	struct ksock_sched *sched = conn->ksnc_scheduler;

 	LASSERT(iov_iter_count(to) <= rlen);
-	LASSERT(to->nr_segs <= LNET_MAX_IOV);

 	conn->ksnc_lnet_msg = msg;
 	conn->ksnc_rx_nob_left = rlen;
 	conn->ksnc_rx_to = *to;
+	LASSERT(conn->ksnc_rx_to.nr_segs <= LNET_MAX_IOV);

 	LASSERT(conn->ksnc_rx_scheduled);

 	spin_lock_bh(&sched->kss_lock);

From patchwork Thu Aug 4 01:38:14 2022
From: James Simmons
Date: Wed, 3 Aug 2022 21:38:14 -0400
Message-Id:
<1659577097-19253-30-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 29/32] lustre: ec: code to add support for M to N parity

This code adds basic functionality for calculating N parities for M
data units. This allows much more than just working with raid6
calculations. The code is derived from the Intel isa-l userland
library. Keep the code in a separate module for easy merging upstream
at a later time.

WC-bug-id: https://jira.whamcloud.com/browse/LU-12189
Lustre-commit: 047347170b8aece43 ("LU-12189 ec: code to add support for M to N parity")
Signed-off-by: James Simmons
Signed-off-by: Adam Disney
Reviewed-on: https://review.whamcloud.com/47628
Reviewed-by: Andreas Dilger
Reviewed-by: Patrick Farrell
Reviewed-by: Oleg Drokin
---
 fs/lustre/Makefile               |   2 +-
 fs/lustre/ec/Makefile            |   4 +
 fs/lustre/ec/ec_base.c           | 352 +++++++++++++++++++++++++++++++++
 fs/lustre/include/erasure_code.h | 142 ++++++++++++++
 4 files changed, 499 insertions(+), 1 deletion(-)
 create mode 100644 fs/lustre/ec/Makefile
 create mode 100644 fs/lustre/ec/ec_base.c
 create mode 100644 fs/lustre/include/erasure_code.h

diff --git a/fs/lustre/Makefile b/fs/lustre/Makefile
index 4a02463..4af6ff6 100644
--- a/fs/lustre/Makefile
+++ b/fs/lustre/Makefile
@@ -1,3 +1,3 @@
-obj-$(CONFIG_LUSTRE_FS) += obdclass/ ptlrpc/ fld/ osc/ mgc/ \
+obj-$(CONFIG_LUSTRE_FS) += obdclass/ ptlrpc/ fld/ osc/ mgc/ ec/ \
 			   fid/ lov/ mdc/ lmv/ llite/ obdecho/
diff --git a/fs/lustre/ec/Makefile b/fs/lustre/ec/Makefile
new file mode 100644
index 0000000..aba8ea3
--- /dev/null
+++ b/fs/lustre/ec/Makefile
@@ -0,0 +1,4 @@
+ccflags-y += -I$(srctree)/$(src)/../include
+
+obj-$(CONFIG_LUSTRE_FS) += ec.o
+ec-y := ec_base.o
diff --git a/fs/lustre/ec/ec_base.c b/fs/lustre/ec/ec_base.c
new file mode 100644
index 0000000..e520466
--- /dev/null
+++ b/fs/lustre/ec/ec_base.c
@@ -0,0 +1,352 @@
+// SPDX-License-Identifier: BSD-2-Clause
+/**********************************************************************
+ * Copyright(c) 2011-2015 Intel Corporation All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *   Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ *   Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ *   Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include
+#include
+#include		/* for memset */
+
+#include "erasure_code.h"
+
+/* Global GF(256) tables */
+static const unsigned char gff_base[] = {
+	0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1d, 0x3a,
+	0x74, 0xe8, 0xcd, 0x87, 0x13, 0x26, 0x4c, 0x98, 0x2d, 0x5a,
+	0xb4, 0x75, 0xea, 0xc9, 0x8f, 0x03, 0x06, 0x0c, 0x18, 0x30,
+	0x60, 0xc0, 0x9d, 0x27, 0x4e, 0x9c, 0x25, 0x4a, 0x94, 0x35,
+	0x6a, 0xd4, 0xb5, 0x77, 0xee, 0xc1, 0x9f, 0x23, 0x46, 0x8c,
+	0x05, 0x0a, 0x14, 0x28, 0x50, 0xa0, 0x5d, 0xba, 0x69, 0xd2,
+	0xb9, 0x6f, 0xde, 0xa1, 0x5f, 0xbe, 0x61, 0xc2, 0x99, 0x2f,
+	0x5e, 0xbc, 0x65, 0xca, 0x89, 0x0f, 0x1e, 0x3c, 0x78, 0xf0,
+	0xfd, 0xe7, 0xd3, 0xbb, 0x6b, 0xd6, 0xb1, 0x7f, 0xfe, 0xe1,
+	0xdf, 0xa3, 0x5b, 0xb6, 0x71, 0xe2, 0xd9, 0xaf, 0x43, 0x86,
+	0x11, 0x22, 0x44, 0x88, 0x0d, 0x1a, 0x34, 0x68, 0xd0, 0xbd,
+	0x67, 0xce, 0x81, 0x1f, 0x3e, 0x7c, 0xf8, 0xed, 0xc7, 0x93,
+	0x3b, 0x76, 0xec, 0xc5, 0x97, 0x33, 0x66, 0xcc, 0x85, 0x17,
+	0x2e, 0x5c, 0xb8, 0x6d, 0xda, 0xa9, 0x4f, 0x9e, 0x21, 0x42,
+	0x84, 0x15, 0x2a, 0x54, 0xa8, 0x4d, 0x9a, 0x29, 0x52, 0xa4,
+	0x55, 0xaa, 0x49, 0x92, 0x39, 0x72, 0xe4, 0xd5, 0xb7, 0x73,
+	0xe6, 0xd1, 0xbf, 0x63, 0xc6, 0x91, 0x3f, 0x7e, 0xfc, 0xe5,
+	0xd7, 0xb3, 0x7b, 0xf6, 0xf1, 0xff, 0xe3, 0xdb, 0xab, 0x4b,
+	0x96, 0x31, 0x62, 0xc4, 0x95, 0x37, 0x6e, 0xdc, 0xa5, 0x57,
+	0xae, 0x41, 0x82, 0x19, 0x32, 0x64, 0xc8, 0x8d, 0x07, 0x0e,
+	0x1c, 0x38, 0x70, 0xe0, 0xdd, 0xa7, 0x53, 0xa6, 0x51,
0xa2, + 0x59, 0xb2, 0x79, 0xf2, 0xf9, 0xef, 0xc3, 0x9b, 0x2b, 0x56, + 0xac, 0x45, 0x8a, 0x09, 0x12, 0x24, 0x48, 0x90, 0x3d, 0x7a, + 0xf4, 0xf5, 0xf7, 0xf3, 0xfb, 0xeb, 0xcb, 0x8b, 0x0b, 0x16, + 0x2c, 0x58, 0xb0, 0x7d, 0xfa, 0xe9, 0xcf, 0x83, 0x1b, 0x36, + 0x6c, 0xd8, 0xad, 0x47, 0x8e, 0x01 +}; + +static const unsigned char gflog_base[] = { + 0x00, 0xff, 0x01, 0x19, 0x02, 0x32, 0x1a, 0xc6, 0x03, 0xdf, + 0x33, 0xee, 0x1b, 0x68, 0xc7, 0x4b, 0x04, 0x64, 0xe0, 0x0e, + 0x34, 0x8d, 0xef, 0x81, 0x1c, 0xc1, 0x69, 0xf8, 0xc8, 0x08, + 0x4c, 0x71, 0x05, 0x8a, 0x65, 0x2f, 0xe1, 0x24, 0x0f, 0x21, + 0x35, 0x93, 0x8e, 0xda, 0xf0, 0x12, 0x82, 0x45, 0x1d, 0xb5, + 0xc2, 0x7d, 0x6a, 0x27, 0xf9, 0xb9, 0xc9, 0x9a, 0x09, 0x78, + 0x4d, 0xe4, 0x72, 0xa6, 0x06, 0xbf, 0x8b, 0x62, 0x66, 0xdd, + 0x30, 0xfd, 0xe2, 0x98, 0x25, 0xb3, 0x10, 0x91, 0x22, 0x88, + 0x36, 0xd0, 0x94, 0xce, 0x8f, 0x96, 0xdb, 0xbd, 0xf1, 0xd2, + 0x13, 0x5c, 0x83, 0x38, 0x46, 0x40, 0x1e, 0x42, 0xb6, 0xa3, + 0xc3, 0x48, 0x7e, 0x6e, 0x6b, 0x3a, 0x28, 0x54, 0xfa, 0x85, + 0xba, 0x3d, 0xca, 0x5e, 0x9b, 0x9f, 0x0a, 0x15, 0x79, 0x2b, + 0x4e, 0xd4, 0xe5, 0xac, 0x73, 0xf3, 0xa7, 0x57, 0x07, 0x70, + 0xc0, 0xf7, 0x8c, 0x80, 0x63, 0x0d, 0x67, 0x4a, 0xde, 0xed, + 0x31, 0xc5, 0xfe, 0x18, 0xe3, 0xa5, 0x99, 0x77, 0x26, 0xb8, + 0xb4, 0x7c, 0x11, 0x44, 0x92, 0xd9, 0x23, 0x20, 0x89, 0x2e, + 0x37, 0x3f, 0xd1, 0x5b, 0x95, 0xbc, 0xcf, 0xcd, 0x90, 0x87, + 0x97, 0xb2, 0xdc, 0xfc, 0xbe, 0x61, 0xf2, 0x56, 0xd3, 0xab, + 0x14, 0x2a, 0x5d, 0x9e, 0x84, 0x3c, 0x39, 0x53, 0x47, 0x6d, + 0x41, 0xa2, 0x1f, 0x2d, 0x43, 0xd8, 0xb7, 0x7b, 0xa4, 0x76, + 0xc4, 0x17, 0x49, 0xec, 0x7f, 0x0c, 0x6f, 0xf6, 0x6c, 0xa1, + 0x3b, 0x52, 0x29, 0x9d, 0x55, 0xaa, 0xfb, 0x60, 0x86, 0xb1, + 0xbb, 0xcc, 0x3e, 0x5a, 0xcb, 0x59, 0x5f, 0xb0, 0x9c, 0xa9, + 0xa0, 0x51, 0x0b, 0xf5, 0x16, 0xeb, 0x7a, 0x75, 0x2c, 0xd7, + 0x4f, 0xae, 0xd5, 0xe9, 0xe6, 0xe7, 0xad, 0xe8, 0x74, 0xd6, + 0xf4, 0xea, 0xa8, 0x50, 0x58, 0xaf +}; + +void ec_init_tables(int k, int rows, unsigned char *a, unsigned 
char *g_tbls) +{ + int i, j; + + for (i = 0; i < rows; i++) { + for (j = 0; j < k; j++) { + gf_vect_mul_init(*a++, g_tbls); + g_tbls += 32; + } + } +} +EXPORT_SYMBOL(ec_init_tables); + +unsigned char gf_mul(unsigned char a, unsigned char b) +{ + int i; + + if ((a == 0) || (b == 0)) + return 0; + + i = gflog_base[a] + gflog_base[b]; + return gff_base[i > 254 ? i - 255 : i]; +} + +unsigned char gf_inv(unsigned char a) +{ + if (a == 0) + return 0; + + return gff_base[255 - gflog_base[a]]; +} + +void gf_gen_cauchy1_matrix(unsigned char *a, int m, int k) +{ + int i, j; + unsigned char *p; + + /* Identity matrix in high position */ + memset(a, 0, k * m); + for (i = 0; i < k; i++) + a[k * i + i] = 1; + + /* For the rest choose 1/(i + j) | i != j */ + p = &a[k * k]; + for (i = k; i < m; i++) + for (j = 0; j < k; j++) + *p++ = gf_inv(i ^ j); + +} +EXPORT_SYMBOL(gf_gen_cauchy1_matrix); + +int gf_invert_matrix(unsigned char *in_mat, unsigned char *out_mat, const int n) +{ + int i, j, k; + unsigned char temp; + + /* Set out_mat[] to the identity matrix */ + for (i = 0; i < n * n; i++) /* memset(out_mat, 0, n*n) */ + out_mat[i] = 0; + + for (i = 0; i < n; i++) + out_mat[i * n + i] = 1; + + /* Inverse */ + for (i = 0; i < n; i++) { + /* Check for 0 in pivot element */ + if (in_mat[i * n + i] == 0) { + /* Find a row with non-zero in current column and swap */ + for (j = i + 1; j < n; j++) + if (in_mat[j * n + i]) + break; + + if (j == n) /* Couldn't find means it's singular */ + return -1; + + for (k = 0; k < n; k++) { /* Swap rows i,j */ + temp = in_mat[i * n + k]; + in_mat[i * n + k] = in_mat[j * n + k]; + in_mat[j * n + k] = temp; + + temp = out_mat[i * n + k]; + out_mat[i * n + k] = out_mat[j * n + k]; + out_mat[j * n + k] = temp; + } + } + + temp = gf_inv(in_mat[i * n + i]); /* 1/pivot */ + for (j = 0; j < n; j++) { /* Scale row i by 1/pivot */ + in_mat[i * n + j] = gf_mul(in_mat[i * n + j], temp); + out_mat[i * n + j] = gf_mul(out_mat[i * n + j], temp); + } + + for (j = 0; 
j < n; j++) { + if (j == i) + continue; + + temp = in_mat[j * n + i]; + for (k = 0; k < n; k++) { + out_mat[j * n + k] ^= gf_mul(temp, out_mat[i * n + k]); + in_mat[j * n + k] ^= gf_mul(temp, in_mat[i * n + k]); + } + } + } + return 0; +} +EXPORT_SYMBOL(gf_invert_matrix); + +/* Calculates const table gftbl in GF(2^8) from single input A + * gftbl(A) = {A{00}, A{01}, A{02}, ... , A{0f} }, {A{00}, A{10}, A{20}, ... , + * A{f0} } + */ +void gf_vect_mul_init(unsigned char c, unsigned char *tbl) +{ + unsigned char c2 = (c << 1) ^ ((c & 0x80) ? 0x1d : 0); /* Mult by + * GF{2} + */ + unsigned char c4 = (c2 << 1) ^ ((c2 & 0x80) ? 0x1d : 0); /* Mult by + * GF{2} + */ + unsigned char c8 = (c4 << 1) ^ ((c4 & 0x80) ? 0x1d : 0); /* Mult by + * GF{2} + */ +#if BITS_PER_LONG == 64 + unsigned long long v1, v2, v4, v8, *t; + unsigned long long v10, v20, v40, v80; + unsigned char c17, c18, c20, c24; + + t = (unsigned long long *)tbl; + + v1 = c * 0x0100010001000100ull; + v2 = c2 * 0x0101000001010000ull; + v4 = c4 * 0x0101010100000000ull; + v8 = c8 * 0x0101010101010101ull; + + v4 = v1 ^ v2 ^ v4; + t[0] = v4; + t[1] = v8 ^ v4; + + c17 = (c8 << 1) ^ ((c8 & 0x80) ? 0x1d : 0); //Mult by GF{2} + c18 = (c17 << 1) ^ ((c17 & 0x80) ? 0x1d : 0); //Mult by GF{2} + c20 = (c18 << 1) ^ ((c18 & 0x80) ? 0x1d : 0); //Mult by GF{2} + c24 = (c20 << 1) ^ ((c20 & 0x80) ? 
0x1d : 0); //Mult by GF{2} + + v10 = c17 * 0x0100010001000100ull; + v20 = c18 * 0x0101000001010000ull; + v40 = c20 * 0x0101010100000000ull; + v80 = c24 * 0x0101010101010101ull; + + v40 = v10 ^ v20 ^ v40; + t[2] = v40; + t[3] = v80 ^ v40; + +#else /* 32-bit or other */ + unsigned char c3, c5, c6, c7, c9, c10, c11, c12, c13, c14, c15; + unsigned char c17, c18, c19, c20, c21, c22, c23, c24, c25, c26; + unsigned char c27, c28, c29, c30, c31; + + c3 = c2 ^ c; + c5 = c4 ^ c; + c6 = c4 ^ c2; + c7 = c4 ^ c3; + + c9 = c8 ^ c; + c10 = c8 ^ c2; + c11 = c8 ^ c3; + c12 = c8 ^ c4; + c13 = c8 ^ c5; + c14 = c8 ^ c6; + c15 = c8 ^ c7; + + tbl[0] = 0; + tbl[1] = c; + tbl[2] = c2; + tbl[3] = c3; + tbl[4] = c4; + tbl[5] = c5; + tbl[6] = c6; + tbl[7] = c7; + tbl[8] = c8; + tbl[9] = c9; + tbl[10] = c10; + tbl[11] = c11; + tbl[12] = c12; + tbl[13] = c13; + tbl[14] = c14; + tbl[15] = c15; + + c17 = (c8 << 1) ^ ((c8 & 0x80) ? 0x1d : 0); /* Mult by GF{2} */ + c18 = (c17 << 1) ^ ((c17 & 0x80) ? 0x1d : 0); /* Mult by GF{2} */ + c19 = c18 ^ c17; + c20 = (c18 << 1) ^ ((c18 & 0x80) ? 0x1d : 0); /* Mult by GF{2} */ + c21 = c20 ^ c17; + c22 = c20 ^ c18; + c23 = c20 ^ c19; + c24 = (c20 << 1) ^ ((c20 & 0x80) ? 
0x1d : 0); /* Mult by GF{2} */ + c25 = c24 ^ c17; + c26 = c24 ^ c18; + c27 = c24 ^ c19; + c28 = c24 ^ c20; + c29 = c24 ^ c21; + c30 = c24 ^ c22; + c31 = c24 ^ c23; + + tbl[16] = 0; + tbl[17] = c17; + tbl[18] = c18; + tbl[19] = c19; + tbl[20] = c20; + tbl[21] = c21; + tbl[22] = c22; + tbl[23] = c23; + tbl[24] = c24; + tbl[25] = c25; + tbl[26] = c26; + tbl[27] = c27; + tbl[28] = c28; + tbl[29] = c29; + tbl[30] = c30; + tbl[31] = c31; + +#endif /* BITS_PER_LONG == 64 */ +} + +void ec_encode_data(int len, int srcs, int dests, unsigned char *v, + unsigned char **src, unsigned char **dest) +{ + int i, j, l; + unsigned char s; + + for (l = 0; l < dests; l++) { + for (i = 0; i < len; i++) { + s = 0; + for (j = 0; j < srcs; j++) + s ^= gf_mul(src[j][i], v[j * 32 + l * srcs * 32 + 1]); + + dest[l][i] = s; + } + } +} +EXPORT_SYMBOL(ec_encode_data); + +static int __init ec_init(void) +{ + return 0; +} + +static void __exit ec_exit(void) +{ +} + +MODULE_AUTHOR("Intel Corporation"); +MODULE_DESCRIPTION("M to N erasure code handling"); +MODULE_VERSION("1.0.0"); +MODULE_LICENSE("Dual BSD/GPL"); + +module_init(ec_init); +module_exit(ec_exit); diff --git a/fs/lustre/include/erasure_code.h b/fs/lustre/include/erasure_code.h new file mode 100644 index 0000000..9e62c2b --- /dev/null +++ b/fs/lustre/include/erasure_code.h @@ -0,0 +1,142 @@ +// SPDX-License-Identifier: BSD-2-Clause +/********************************************************************** + * Copyright(c) 2011-2015 Intel Corporation All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. 
+ * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _ERASURE_CODE_H_ +#define _ERASURE_CODE_H_ + +/** + * @file erasure_code.h + * @brief Interface to functions supporting erasure code encode and decode. + * + * This file defines the interface to optimized functions used in erasure + * codes. Encode and decode of erasures in GF(2^8) are made by calculating the + * dot product of the symbols (bytes in GF(2^8)) across a set of buffers and a + * set of coefficients. Values for the coefficients are determined by the type + * of erasure code. Using a general dot product means that any sequence of + * coefficients may be used including erasure codes based on random + * coefficients. + * Multiple versions of dot product are supplied to calculate 1-6 output + * vectors in one pass. 
+ * Base GF multiply and divide functions can be sped up by defining + * GF_LARGE_TABLES at the expense of memory size. + * + */ + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * @brief Initialize 32-byte constant array for GF(2^8) vector multiply + * + * Calculates array {C{00}, C{01}, C{02}, ... , C{0f} }, {C{00}, C{10}, + * C{20}, ... , C{f0} } as required by other fast vector multiply + * functions. + * @param c Constant input. + * @param gftbl Table output. + */ +void gf_vect_mul_init(unsigned char c, unsigned char *gftbl); + +/** + * @brief Initialize tables for fast Erasure Code encode and decode. + * + * Generates the expanded tables needed for fast encode or decode for erasure + * codes on blocks of data. 32 bytes are generated for each input coefficient. + * + * @param k The number of vector sources or rows in the generator matrix + * for coding. + * @param rows The number of output vectors to concurrently encode/decode. + * @param a Pointer to sets of arrays of input coefficients used to encode + * or decode data. + * @param gftbls Pointer to start of space for concatenated output tables + * generated from input coefficients. Must be of size 32*k*rows. + * @returns none + */ +void ec_init_tables(int k, int rows, unsigned char *a, unsigned char *gftbls); + +/** + * @brief Generate or decode erasure codes on blocks of data, runs appropriate + * version. + * + * Given a list of source data blocks, generate one or multiple blocks of + * encoded data as specified by a matrix of GF(2^8) coefficients. When given a + * suitable set of coefficients, this function will perform the fast generation + * or decoding of Reed-Solomon type erasure codes. + * + * This function determines what instruction sets are enabled and + * selects the appropriate version at runtime. + * + * @param len Length of each block of data (vector) of source or dest data. + * @param k The number of vector sources or rows in the generator matrix + * for coding.
+ * @param rows The number of output vectors to concurrently encode/decode. + * @param gftbls Pointer to array of input tables generated from coding + * coefficients in ec_init_tables(). Must be of size 32*k*rows + * @param data Array of pointers to source input buffers. + * @param coding Array of pointers to coded output buffers. + * @returns none + */ +void ec_encode_data(int len, int k, int rows, unsigned char *gftbls, + unsigned char **data, unsigned char **coding); + +/** + * @brief Generate a Cauchy matrix of coefficients to be used for encoding. + * + * Cauchy matrix example of encoding coefficients where high portion of matrix + * is identity matrix I and lower portion is constructed as 1/(i + j) | i != j, + * i:{0,k-1} j:{k,m-1}. Any sub-matrix of a Cauchy matrix should be invertible. + * + * @param a [m x k] array to hold coefficients + * @param m number of rows in matrix corresponding to srcs + parity. + * @param k number of columns in matrix corresponding to srcs. + * @returns none + */ +void gf_gen_cauchy1_matrix(unsigned char *a, int m, int k); + +/** + * @brief Invert a matrix in GF(2^8) + * + * Attempts to construct an n x n inverse of the input matrix. Returns non-zero + * if singular. Will always destroy input matrix in process.
+ * @param in input matrix, destroyed by invert process + * @param out output matrix such that [in] x [out] = [I] - identity matrix + * @param n size of matrix [nxn] + * @returns 0 successful, other fail on singular input matrix + */ +int gf_invert_matrix(unsigned char *in, unsigned char *out, const int n); + +/*************************************************************/ + +#ifdef __cplusplus +} +#endif + +#endif /* _ERASURE_CODE_H_ */ From patchwork Thu Aug 4 01:38:15 2022 X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12935984 From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:38:15 -0400 Message-Id:
<1659577097-19253-31-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 30/32] lustre: llite: use max default EA size to get default LMV Cc: Lai Siyao , Lustre Development List From: Lai Siyao A subdirectory mount fetches the ROOT default LMV and sets it, but the default EA size cl_default_mds_easize may not be set for MDT0 yet: it is only updated upon getattr/enqueue, so if the subdir mount is not on MDT0 it may not have been initialized. Use the max EA size to fetch the default layout in ll_dir_get_default_layout().
Fixes: 4cee9af853 ("lustre: llite: enforce ROOT default on subdir mount") WC-bug-id: https://jira.whamcloud.com/browse/LU-15910 Lustre-commit: bb588480d4cdd6847 ("LU-15910 llite: use max default EA size to get default LMV") Signed-off-by: Lai Siyao Reviewed-on: https://review.whamcloud.com/47937 Reviewed-by: Andreas Dilger Reviewed-by: Sebastien Buisson Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/dir.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 26c9ec3..3384d81 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -663,17 +663,13 @@ int ll_dir_get_default_layout(struct inode *inode, void **plmm, int *plmm_size, struct mdt_body *body; struct lov_mds_md *lmm = NULL; struct ptlrpc_request *req = NULL; - int rc, lmmsize; + int lmmsize = OBD_MAX_DEFAULT_EA_SIZE; struct md_op_data *op_data; struct lu_fid fid; + int rc; - rc = ll_get_max_mdsize(sbi, &lmmsize); - if (rc) - return rc; - - op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, - 0, lmmsize, LUSTRE_OPC_ANY, - NULL); + op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, lmmsize, + LUSTRE_OPC_ANY, NULL); if (IS_ERR(op_data)) return PTR_ERR(op_data); From patchwork Thu Aug 4 01:38:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12935999 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E0711C19F2D for ; Thu, 4 Aug 2022 01:40:07 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com 
(Postfix) with ESMTP id 4LyryH3S24z23L6; Wed, 3 Aug 2022 18:40:07 -0700 (PDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:38:16 -0400 Message-Id: <1659577097-19253-32-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 31/32] lustre: llite: pass dmv inherit depth instead of dir depth Cc: Lai Siyao , Lustre Development List From: Lai Siyao In directory creation, once its ancestor has a default LMV, pass the inherit depth; otherwise pass the directory depth to ROOT. This depth will be used in QoS allocation.
WC-bug-id: https://jira.whamcloud.com/browse/LU-15850 Lustre-commit: c23c68a52a0436910 ("LU-15850 llite: pass dmv inherit depth instead of dir depth") Signed-off-by: Lai Siyao Reviewed-on: https://review.whamcloud.com/47577 Reviewed-by: Andreas Dilger Reviewed-by: Hongchao Zhang Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_lmv.h | 23 +++++++++++++++++-- fs/lustre/llite/dir.c | 3 ++- fs/lustre/llite/llite_internal.h | 4 ++++ fs/lustre/llite/llite_lib.c | 48 +++++++++++++++++++++++++++++++++++++--- fs/lustre/llite/namei.c | 4 ++-- 5 files changed, 74 insertions(+), 8 deletions(-) diff --git a/fs/lustre/include/lustre_lmv.h b/fs/lustre/include/lustre_lmv.h index cd7cf9e..3720a97 100644 --- a/fs/lustre/include/lustre_lmv.h +++ b/fs/lustre/include/lustre_lmv.h @@ -51,8 +51,6 @@ struct lmv_stripe_md { u32 lsm_md_layout_version; u32 lsm_md_migrate_offset; u32 lsm_md_migrate_hash; - u32 lsm_md_default_count; - u32 lsm_md_default_index; char lsm_md_pool_name[LOV_MAXPOOLNAME + 1]; struct lmv_oinfo lsm_md_oinfo[0]; }; @@ -513,4 +511,25 @@ static inline bool lmv_is_layout_changing(const struct lmv_mds_md_v1 *lmv) lmv_hash_is_migrating(cpu_to_le32(lmv->lmv_hash_type)); } +static inline u8 lmv_inherit_next(u8 inherit) +{ + if (inherit == LMV_INHERIT_END || inherit == LMV_INHERIT_NONE) + return LMV_INHERIT_NONE; + + if (inherit == LMV_INHERIT_UNLIMITED || inherit > LMV_INHERIT_MAX) + return inherit; + + return inherit - 1; +} + +static inline u8 lmv_inherit_rr_next(u8 inherit_rr) +{ + if (inherit_rr == LMV_INHERIT_RR_NONE || + inherit_rr == LMV_INHERIT_RR_UNLIMITED || + inherit_rr > LMV_INHERIT_RR_MAX) + return inherit_rr; + + return inherit_rr - 1; +} + #endif diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 3384d81..aea15f5 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -491,7 +491,8 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump, if (IS_ERR(op_data)) return 
PTR_ERR(op_data); - op_data->op_dir_depth = ll_i2info(parent)->lli_dir_depth; + op_data->op_dir_depth = ll_i2info(parent)->lli_inherit_depth ?: + ll_i2info(parent)->lli_dir_depth; if (ll_sbi_has_encrypt(sbi) && (IS_ENCRYPTED(parent) || diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index c350440..2139f88 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -183,6 +183,10 @@ struct ll_inode_info { pid_t lli_opendir_pid; /* directory depth to ROOT */ unsigned short lli_dir_depth; + /* directory depth to ancestor whose default LMV is + * inherited. + */ + unsigned short lli_inherit_depth; /* stat will try to access statahead entries or start * statahead if this flag is set, and this flag will be * set upon dir open, and cleared when dir is closed, diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index d947ede..dee2e51 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -1561,6 +1561,7 @@ static void ll_update_default_lsm_md(struct inode *inode, struct lustre_md *md) lmv_free_memmd(lli->lli_default_lsm_md); lli->lli_default_lsm_md = NULL; } + lli->lli_inherit_depth = 0; up_write(&lli->lli_lsm_sem); } return; @@ -2648,9 +2649,34 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md) return 0; } +/* child default LMV is inherited from parent */ +static inline bool ll_default_lmv_inherited(struct lmv_stripe_md *pdmv, + struct lmv_stripe_md *cdmv) +{ + if (!pdmv || !cdmv) + return false; + + if (pdmv->lsm_md_magic != cdmv->lsm_md_magic || + pdmv->lsm_md_stripe_count != cdmv->lsm_md_stripe_count || + pdmv->lsm_md_master_mdt_index != cdmv->lsm_md_master_mdt_index || + pdmv->lsm_md_hash_type != cdmv->lsm_md_hash_type) + return false; + + if (cdmv->lsm_md_max_inherit != + lmv_inherit_next(pdmv->lsm_md_max_inherit)) + return false; + + if (cdmv->lsm_md_max_inherit_rr != + lmv_inherit_rr_next(pdmv->lsm_md_max_inherit_rr)) + return false; + + return 
true; +} + /* update directory depth to ROOT, called after LOOKUP lock is fetched. */ void ll_update_dir_depth(struct inode *dir, struct inode *inode) { + struct ll_inode_info *plli; struct ll_inode_info *lli; if (!S_ISDIR(inode->i_mode)) @@ -2659,10 +2685,26 @@ void ll_update_dir_depth(struct inode *dir, struct inode *inode) if (inode == dir) return; + plli = ll_i2info(dir); lli = ll_i2info(inode); - lli->lli_dir_depth = ll_i2info(dir)->lli_dir_depth + 1; - CDEBUG(D_INODE, DFID" depth %hu\n", - PFID(&lli->lli_fid), lli->lli_dir_depth); + lli->lli_dir_depth = plli->lli_dir_depth + 1; + if (plli->lli_default_lsm_md && lli->lli_default_lsm_md) { + down_read(&plli->lli_lsm_sem); + down_read(&lli->lli_lsm_sem); + if (ll_default_lmv_inherited(plli->lli_default_lsm_md, + lli->lli_default_lsm_md)) + lli->lli_inherit_depth = + plli->lli_inherit_depth + 1; + else + lli->lli_inherit_depth = 0; + up_read(&lli->lli_lsm_sem); + up_read(&plli->lli_lsm_sem); + } else { + lli->lli_inherit_depth = 0; + } + + CDEBUG(D_INODE, DFID" depth %hu default LMV depth %hu\n", + PFID(&lli->lli_fid), lli->lli_dir_depth, lli->lli_inherit_depth); } void ll_truncate_inode_pages_final(struct inode *inode) diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c index cc7b243..2215dd8 100644 --- a/fs/lustre/llite/namei.c +++ b/fs/lustre/llite/namei.c @@ -1496,7 +1496,7 @@ static void ll_qos_mkdir_prep(struct md_op_data *op_data, struct inode *dir) struct ll_inode_info *lli = ll_i2info(dir); struct lmv_stripe_md *lsm; - op_data->op_dir_depth = lli->lli_dir_depth; + op_data->op_dir_depth = lli->lli_inherit_depth ?: lli->lli_dir_depth; /* parent directory is striped */ if (unlikely(lli->lli_lsm_md)) @@ -1635,7 +1635,7 @@ static int ll_new_node(struct inode *dir, struct dentry *dchild, from_kuid(&init_user_ns, current_fsuid()), from_kgid(&init_user_ns, current_fsgid()), current_cap(), rdev, &request); -#if OBD_OCD_VERSION(2, 14, 58, 0) > LUSTRE_VERSION_CODE +#if OBD_OCD_VERSION(2, 14, 58, 0) < 
LUSTRE_VERSION_CODE /* * server < 2.12.58 doesn't pack default LMV in intent_getattr reply, * fetch default LMV here. From patchwork Thu Aug 4 01:38:17 2022 X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 12935996 From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 3 Aug 2022 21:38:17 -0400 Message-Id: <1659577097-19253-33-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> References: <1659577097-19253-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 32/32] lustre: ldlm: Prioritize blocking callbacks X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version:
2.1.39 Cc: Lustre Development List From: Patrick Farrell The current code places bl_ast lock callbacks at the end of the global BL callback queue. This is bad because it makes urgent requests from the server wait behind non-urgent cleanup work that keeps lru_size at the right level. With a large backlog in the global queue, the callback may not be serviced in a timely manner, which can lead to evictions. Put bl_ast callbacks on the priority queue so they do not wait behind the background traffic. Add some additional debug in this area. WC-bug-id: https://jira.whamcloud.com/browse/LU-15821 Lustre-commit: 2d59294d52b696125 ("LU-15821 ldlm: Prioritize blocking callbacks") Signed-off-by: Patrick Farrell Reviewed-on: https://review.whamcloud.com/47215 Reviewed-by: Andreas Dilger Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ldlm/ldlm_lockd.c | 39 +++++++++++++++++++++++++++++++++++++-- 1 file changed, 37 insertions(+), 2 deletions(-) diff --git a/fs/lustre/ldlm/ldlm_lockd.c b/fs/lustre/ldlm/ldlm_lockd.c index 04fe92e..9f89766 100644 --- a/fs/lustre/ldlm/ldlm_lockd.c +++ b/fs/lustre/ldlm/ldlm_lockd.c @@ -94,6 +94,8 @@ struct ldlm_bl_pool { atomic_t blp_busy_threads; int blp_min_threads; int blp_max_threads; + int blp_total_locks; + int blp_total_blwis; }; struct ldlm_bl_work_item { @@ -399,19 +401,39 @@ static int __ldlm_bl_to_thread(struct ldlm_bl_work_item *blwi, enum ldlm_cancel_flags cancel_flags) { struct ldlm_bl_pool *blp = ldlm_state->ldlm_bl_pool; + char *prio = "regular"; + int count; spin_lock(&blp->blp_lock); - if (blwi->blwi_lock && ldlm_is_discard_data(blwi->blwi_lock)) { - /* add LDLM_FL_DISCARD_DATA requests to the priority list */ + /* cannot access blwi after added to list and lock is
dropped */ + count = blwi->blwi_lock ? 1 : blwi->blwi_count; + + /* if the server is waiting on a lock to be cancelled (bl_ast), this is + * an urgent request and should go in the priority queue so it doesn't + * get stuck behind non-priority work (eg, lru size management) + * + * We also prioritize discard_data, which is for eviction handling + */ + if (blwi->blwi_lock && + (ldlm_is_discard_data(blwi->blwi_lock) || + ldlm_is_bl_ast(blwi->blwi_lock))) { list_add_tail(&blwi->blwi_entry, &blp->blp_prio_list); + prio = "priority"; } else { /* other blocking callbacks are added to the regular list */ list_add_tail(&blwi->blwi_entry, &blp->blp_list); } + blp->blp_total_locks += count; + blp->blp_total_blwis++; spin_unlock(&blp->blp_lock); wake_up(&blp->blp_waitq); + /* unlocked read of blp values is intentional - OK for debug */ + CDEBUG(D_DLMTRACE, + "added %d/%d locks to %s blp list, %d blwis in pool\n", + count, blp->blp_total_locks, prio, blp->blp_total_blwis); + /* * Can not check blwi->blwi_flags as blwi could be already freed in * LCF_ASYNC mode @@ -772,6 +794,17 @@ static int ldlm_bl_get_work(struct ldlm_bl_pool *blp, spin_unlock(&blp->blp_lock); *p_blwi = blwi; + /* intentional unlocked read of blp values - OK for debug */ + if (blwi) { + CDEBUG(D_DLMTRACE, + "Got %d locks of %d total in blp. (%d blwis in pool)\n", + blwi->blwi_lock ? 1 : blwi->blwi_count, + blp->blp_total_locks, blp->blp_total_blwis); + } else { + CDEBUG(D_DLMTRACE, + "No blwi found in queue (no bl locks in queue)\n"); + } + return (*p_blwi || *p_exp) ? 1 : 0; } @@ -1126,6 +1159,8 @@ static int ldlm_setup(void) init_waitqueue_head(&blp->blp_waitq); atomic_set(&blp->blp_num_threads, 0); atomic_set(&blp->blp_busy_threads, 0); + blp->blp_total_locks = 0; + blp->blp_total_blwis = 0; if (ldlm_num_threads == 0) { blp->blp_min_threads = LDLM_NTHRS_INIT;