From patchwork Sun Oct 16 12:30:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Pavlu X-Patchwork-Id: 13007768 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61961C4332F for ; Sun, 16 Oct 2022 12:31:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229607AbiJPMbU (ORCPT ); Sun, 16 Oct 2022 08:31:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46794 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229663AbiJPMbS (ORCPT ); Sun, 16 Oct 2022 08:31:18 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2001:67c:2178:6::1c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B010331DC2; Sun, 16 Oct 2022 05:31:17 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 65BAA33A43; Sun, 16 Oct 2022 12:31:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1665923476; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Fb+yTSMK3jSZs82B8z4U15J58ZeLRAv+ql0OMM3ibe0=; b=dWNXiW4xhEUBvjoCNF1MGYqLIjJ8eRdtHvra0XKja8by17o1yRyDmBjpVeEJ2nGOKPXXca 5+z/V6wdRRnju3MucKRNIHmf+PNHch7qZsJxTjVdDSm1P7kcFnkNEq/4U4ZyqQH6aPiSpD k57gbxuBI50O89ItIbXQujOIucMyOuM= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 40A6113A36; Sun, 16 Oct 2022 12:31:16 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 6CvJDpT5S2MyMAAAMHmgww (envelope-from ); Sun, 16 Oct 2022 12:31:16 +0000 From: Petr Pavlu To: mcgrof@kernel.org Cc: pmladek@suse.com, david@redhat.com, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, Petr Pavlu Subject: [PATCH v3 1/4] module: Correct wake up of module_wq Date: Sun, 16 Oct 2022 14:30:28 +0200 Message-Id: <20221016123031.3963-2-petr.pavlu@suse.com> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20221016123031.3963-1-petr.pavlu@suse.com> References: <20221016123031.3963-1-petr.pavlu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: The module_wq wait queue has only non-exclusive waiters and all waits are interruptible, therefore for consistency use wake_up_interruptible() to wake its waiters. Suggested-by: Petr Mladek Reviewed-by: Petr Mladek Signed-off-by: Petr Pavlu Reviewed-by: David Hildenbrand --- kernel/module/main.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/module/main.c b/kernel/module/main.c index d02d39c7174e..a12e177ea81f 100644 --- a/kernel/module/main.c +++ b/kernel/module/main.c @@ -764,7 +764,7 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user, free_module(mod); /* someone could wait for the module in add_unformed_module() */ - wake_up_all(&module_wq); + wake_up_interruptible(&module_wq); return 0; out: mutex_unlock(&module_mutex); @@ -2522,7 +2522,7 @@ static noinline int do_init_module(struct module *mod) schedule_work(&init_free_wq); mutex_unlock(&module_mutex); - wake_up_all(&module_wq); + wake_up_interruptible(&module_wq); return 0; @@ -2538,7 +2538,7 @@ static noinline int do_init_module(struct module *mod) klp_module_going(mod); ftrace_release_mod(mod); free_module(mod); - wake_up_all(&module_wq); + wake_up_interruptible(&module_wq); return ret; } @@ -2879,7 +2879,7 @@ static int load_module(struct load_info *info, const char __user *uargs, /* Unlink carefully: kallsyms could be walking list. */ list_del_rcu(&mod->list); mod_tree_remove(mod); - wake_up_all(&module_wq); + wake_up_interruptible(&module_wq); /* Wait for RCU-sched synchronizing before releasing mod->list. */ synchronize_rcu(); mutex_unlock(&module_mutex); From patchwork Sun Oct 16 12:30:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Pavlu X-Patchwork-Id: 13007769 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0CB9C4332F for ; Sun, 16 Oct 2022 12:31:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229780AbiJPMbV (ORCPT ); Sun, 16 Oct 2022 08:31:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229769AbiJPMbU (ORCPT ); Sun, 16 Oct 2022 08:31:20 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2001:67c:2178:6::1c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4CAB036845; Sun, 16 Oct 2022 05:31:19 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id EA0B633A46; Sun, 16 Oct 2022 12:31:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1665923477; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qD3P5jMHsGMDFWM0XMYxlTpTliGBHcgI0IoO+d5uMKw=; b=bs65FvbUgTVRWxu05wBg6EcttKoaGu4Sl0WaKZCRNu0yBZCRd/0ZD6Pb3eMfVGEz+kjWnA cxDRF5w5PyxtoTkL5oHlJ4dZcUJ3OYN5Mde52jKaP0nxwOBW5ef/omAFddoEGidv7csQ+X JTozn+2vFRFLGTbz6p0qvbiap9r4fg0= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id BA42813A36; Sun, 16 Oct 2022 12:31:17 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id iE2PLJX5S2MyMAAAMHmgww (envelope-from ); Sun, 16 Oct 2022 12:31:17 +0000 From: Petr Pavlu To: mcgrof@kernel.org Cc: pmladek@suse.com, david@redhat.com, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, Petr Pavlu Subject: [PATCH v3 2/4] module: Update a comment describing what is protected by module_mutex Date: Sun, 16 Oct 2022 14:30:29 +0200 Message-Id: <20221016123031.3963-3-petr.pavlu@suse.com> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20221016123031.3963-1-petr.pavlu@suse.com> References: <20221016123031.3963-1-petr.pavlu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: Signed-off-by: Petr Pavlu Reviewed-by: David Hildenbrand Reviewed-by: Petr Mladek --- kernel/module/main.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/kernel/module/main.c b/kernel/module/main.c index a12e177ea81f..5288843ca40f 100644 --- a/kernel/module/main.c +++ b/kernel/module/main.c @@ -62,10 +62,11 @@ /* * Mutex protects: - * 1) List of modules (also safely readable with preempt_disable), + * 1) list of modules (also safely readable with preempt_disable, delete and add + * uses RCU list operations). * 2) module_use links, - * 3) mod_tree.addr_min/mod_tree.addr_max. - * (delete and add uses RCU list operations). + * 3) mod_tree.addr_min/mod_tree.addr_max, + * 4) list of unloaded_tainted_modules. */ DEFINE_MUTEX(module_mutex); LIST_HEAD(modules); From patchwork Sun Oct 16 12:30:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Pavlu X-Patchwork-Id: 13007770 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 20D08C433FE for ; Sun, 16 Oct 2022 12:31:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229600AbiJPMb2 (ORCPT ); Sun, 16 Oct 2022 08:31:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46904 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229785AbiJPMbX (ORCPT ); Sun, 16 Oct 2022 08:31:23 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6B1713E20; Sun, 16 Oct 2022 05:31:20 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 2FDE733A4A; Sun, 16 Oct 2022 12:31:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1665923479; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nTGbiB9TbHuxNHIem9y1YK9mSJ3fQ2drG27RNZYoZp4=; b=HJiIoiMsCxGcVjATOLp6TkOxDFY0YTM420Pqw+kyRFDoRUNMitSRwNKJJBT6DfHPIx+sXP 64tWfh46Z3RvNCoyc2bqetprBWZnGc0JxG1hWOLe8N0YV4ND23DmWoTR7NepaFkM2ACHOg K3jvR+L1MwYIiWkvZLVOoxguUCqGE54= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 094B513A36; Sun, 16 Oct 2022 12:31:19 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id aKQ9AZf5S2MyMAAAMHmgww (envelope-from ); Sun, 16 Oct 2022 12:31:19 +0000 From: Petr Pavlu To: mcgrof@kernel.org Cc: pmladek@suse.com, david@redhat.com, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, Petr Pavlu Subject: [PATCH v3 3/4] module: Merge same-name module load requests Date: Sun, 16 Oct 2022 14:30:30 +0200 Message-Id: <20221016123031.3963-4-petr.pavlu@suse.com> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20221016123031.3963-1-petr.pavlu@suse.com> References: <20221016123031.3963-1-petr.pavlu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: During a system boot, it can happen that the kernel receives a burst of requests to insert the same module but loading it eventually fails during its init call. For instance, udev can make a request to insert a frequency module for each individual CPU when another frequency module is already loaded which causes the init function of the new module to return an error. The module loader currently serializes all such requests, with the barrier in add_unformed_module(). This creates a lot of unnecessary work and delays the boot. It can prevent udev from loading drivers for other devices and might cause timeouts of services waiting on them and subsequently a failed boot. The mentioned serialization was introduced as a side-effect of commit 6e6de3dee51a. The kernel before that merged some of same load requests although it was more by accident and relied on specific timing. The patch brings this behavior back in a more explicit form. The logic is improved as follows: * A check whether a module load matches an already loaded module is moved right after a module name is determined. -EEXIST continues to be returned if the module exists and is live, -EBUSY is returned if a same-name module is going. * A new reference-counted shared_load_info structure is introduced to keep track of duplicate load requests. Two loads are considered equivalent if their module name matches. In case a load duplicates another running insert, the code waits for its completion and then returns -EEXIST or -EBUSY depending on whether it succeeded. Moving the check for same-name module loads earlier has also a positive effect on reducing memory pressure. For instance, David Hildenbrand and Lin Liu reported [1] that when KASAN_INLINE is enabled (resulting in large module size), with plenty of devices that udev wants to probe and with plenty of CPUs that can carry out that probing concurrently, the system can actually run out of module vmap space and trigger vmap allocation errors. This is fixed by the patch too as it avoids duplicate layout_and_allocate() work. [1] https://lore.kernel.org/all/20221013180518.217405-1-david@redhat.com/ Fixes: 6e6de3dee51a ("kernel/module.c: Only return -EEXIST for modules that have finished loading") Reviewed-by: Petr Mladek Signed-off-by: Petr Pavlu Signed-off-by: author of the patch Acked-by: in order of appearance Tested-by: in order of appearance Signed-off-by: committer (upstream maintainer) Reported-by: Lin Liu Reported-by: David Hildenbrand --- kernel/module/main.c | 217 ++++++++++++++++++++++++++++++------------- 1 file changed, 155 insertions(+), 62 deletions(-) diff --git a/kernel/module/main.c b/kernel/module/main.c index 5288843ca40f..2228c0f725e7 100644 --- a/kernel/module/main.c +++ b/kernel/module/main.c @@ -66,11 +66,28 @@ * uses RCU list operations). * 2) module_use links, * 3) mod_tree.addr_min/mod_tree.addr_max, - * 4) list of unloaded_tainted_modules. + * 4) list of unloaded_tainted_modules, + * 5) list of running_loads. */ DEFINE_MUTEX(module_mutex); LIST_HEAD(modules); +/* Shared information to track duplicate module loads. */ +struct shared_load_info { + char name[MODULE_NAME_LEN]; + refcount_t refcnt; + struct list_head list; + struct completion done; + int err; +}; +static LIST_HEAD(running_loads); + +/* + * Waiting for a module load when the exact module name is not known, for + * example, when resolving symbols from another modules. + */ +static DECLARE_WAIT_QUEUE_HEAD(module_wq); + /* Work queue for freeing init sections in success case */ static void do_free_init(struct work_struct *w); static DECLARE_WORK(init_free_wq, do_free_init); @@ -124,9 +141,6 @@ static void mod_update_bounds(struct module *mod) int modules_disabled; core_param(nomodule, modules_disabled, bint, 0); -/* Waiting for a module to finish initializing? */ -static DECLARE_WAIT_QUEUE_HEAD(module_wq); - static BLOCKING_NOTIFIER_HEAD(module_notify_list); int register_module_notifier(struct notifier_block *nb) @@ -764,8 +778,6 @@ SYSCALL_DEFINE2(delete_module, const char __user *, name_user, strscpy(last_unloaded_module.taints, module_flags(mod, buf, false), sizeof(last_unloaded_module.taints)); free_module(mod); - /* someone could wait for the module in add_unformed_module() */ - wake_up_interruptible(&module_wq); return 0; out: mutex_unlock(&module_mutex); @@ -2373,26 +2385,6 @@ static int post_relocation(struct module *mod, const struct load_info *info) return module_finalize(info->hdr, info->sechdrs, mod); } -/* Is this module of this name done loading? No locks held. */ -static bool finished_loading(const char *name) -{ - struct module *mod; - bool ret; - - /* - * The module_mutex should not be a heavily contended lock; - * if we get the occasional sleep here, we'll go an extra iteration - * in the wait_event_interruptible(), which is harmless. - */ - sched_annotate_sleep(); - mutex_lock(&module_mutex); - mod = find_module_all(name, strlen(name), true); - ret = !mod || mod->state == MODULE_STATE_LIVE; - mutex_unlock(&module_mutex); - - return ret; -} - /* Call module constructors. */ static void do_mod_ctors(struct module *mod) { @@ -2523,7 +2515,6 @@ static noinline int do_init_module(struct module *mod) schedule_work(&init_free_wq); mutex_unlock(&module_mutex); - wake_up_interruptible(&module_wq); return 0; @@ -2539,7 +2530,6 @@ static noinline int do_init_module(struct module *mod) klp_module_going(mod); ftrace_release_mod(mod); free_module(mod); - wake_up_interruptible(&module_wq); return ret; } @@ -2551,43 +2541,138 @@ static int may_init_module(void) return 0; } +static struct shared_load_info * +shared_load_info_alloc(const struct load_info *info) +{ + struct shared_load_info *shared_info = + kzalloc(sizeof(*shared_info), GFP_KERNEL); + if (shared_info == NULL) + return ERR_PTR(-ENOMEM); + + strscpy(shared_info->name, info->name, sizeof(shared_info->name)); + refcount_set(&shared_info->refcnt, 1); + INIT_LIST_HEAD(&shared_info->list); + init_completion(&shared_info->done); + return shared_info; +} + +static void shared_load_info_get(struct shared_load_info *shared_info) +{ + refcount_inc(&shared_info->refcnt); +} + +static void shared_load_info_put(struct shared_load_info *shared_info) +{ + if (refcount_dec_and_test(&shared_info->refcnt)) + kfree(shared_info); +} + /* - * We try to place it in the list now to make sure it's unique before - * we dedicate too many resources. In particular, temporary percpu + * Check that a module load is unique and make it visible to others. The code + * looks for parallel running inserts and already loaded modules. Two inserts + * are considered equivalent if their module name matches. In case this load + * duplicates another running insert, the code waits for its completion and + * then returns -EEXIST or -EBUSY depending on whether it succeeded. + * + * Detecting early that a load is unique avoids dedicating too many cycles and + * resources to bring up the module. In particular, it prevents temporary percpu * memory exhaustion. + * + * Merging same load requests then primarily helps during the boot process. It + * can happen that the kernel receives a burst of requests to load the same + * module (for example, a same module for each individual CPU) and loading it + * eventually fails during its init call. Merging the requests allows that only + * one full attempt to load the module is made. + * + * On a non-error return, it is guaranteed that this load is unique. */ -static int add_unformed_module(struct module *mod) +static struct shared_load_info *add_running_load(const struct load_info *info) { - int err; struct module *old; + struct shared_load_info *shared_info; - mod->state = MODULE_STATE_UNFORMED; - -again: mutex_lock(&module_mutex); - old = find_module_all(mod->name, strlen(mod->name), true); - if (old != NULL) { - if (old->state != MODULE_STATE_LIVE) { - /* Wait in case it fails to load. */ + + /* Search if there is a running load of a module with the same name. */ + list_for_each_entry(shared_info, &running_loads, list) + if (strcmp(shared_info->name, info->name) == 0) { + int err; + + shared_load_info_get(shared_info); mutex_unlock(&module_mutex); - err = wait_event_interruptible(module_wq, - finished_loading(mod->name)); - if (err) - goto out_unlocked; - goto again; + + err = wait_for_completion_interruptible( + &shared_info->done); + /* + * Return -EBUSY when the parallel load failed for any + * reason. This load might end up another way but we are + * not going to try. + */ + if (!err) + err = shared_info->err ? -EBUSY : -EEXIST; + shared_load_info_put(shared_info); + shared_info = ERR_PTR(err); + goto out_unlocked; + } + + /* Search if there is a live module with the given name already. */ + old = find_module_all(info->name, strlen(info->name), true); + if (old != NULL) { + if (old->state == MODULE_STATE_LIVE) { + shared_info = ERR_PTR(-EEXIST); + goto out; } - err = -EEXIST; + + /* + * Any active load always has its record in running_loads and so + * would be found above. This applies independent whether such + * a module is currently in MODULE_STATE_UNFORMED, + * MODULE_STATE_COMING, or even in MODULE_STATE_GOING if its + * initialization failed. It therefore means this must be an + * older going module and the caller should try later once it is + * gone. + */ + WARN_ON(old->state != MODULE_STATE_GOING); + shared_info = ERR_PTR(-EBUSY); goto out; } - mod_update_bounds(mod); - list_add_rcu(&mod->list, &modules); - mod_tree_insert(mod); - err = 0; + + /* The load is unique, make it visible to others. */ + shared_info = shared_load_info_alloc(info); + if (IS_ERR(shared_info)) + goto out; + list_add(&shared_info->list, &running_loads); out: mutex_unlock(&module_mutex); out_unlocked: - return err; + return shared_info; +} + +static void finalize_running_load(struct shared_load_info *shared_info, int err) +{ + /* Inform other duplicate inserts that the load finished. */ + mutex_lock(&module_mutex); + list_del(&shared_info->list); + shared_info->err = err; + mutex_unlock(&module_mutex); + + complete_all(&shared_info->done); + shared_load_info_put(shared_info); + + /* Tell other modules waiting on this one that it completed loading. */ + wake_up_interruptible(&module_wq); +} + +static void add_unformed_module(struct module *mod) +{ + mod->state = MODULE_STATE_UNFORMED; + + mutex_lock(&module_mutex); + mod_update_bounds(mod); + list_add_rcu(&mod->list, &modules); + mod_tree_insert(mod); + mutex_unlock(&module_mutex); } static int complete_formation(struct module *mod, struct load_info *info) @@ -2672,6 +2757,7 @@ static int unknown_module_param_cb(char *param, char *val, const char *modname, static int load_module(struct load_info *info, const char __user *uargs, int flags) { + struct shared_load_info *shared_info; struct module *mod; long err = 0; char *after_dashes; @@ -2709,38 +2795,43 @@ static int load_module(struct load_info *info, const char __user *uargs, goto free_copy; /* - * Now that we know we have the correct module name, check - * if it's blacklisted. + * Now that we know we have the correct module name, check if there is + * another load of the same name in progress. */ + shared_info = add_running_load(info); + if (IS_ERR(shared_info)) { + err = PTR_ERR(shared_info); + goto free_copy; + } + + /* Check if the module is blacklisted. */ if (blacklisted(info->name)) { err = -EPERM; pr_err("Module %s is blacklisted\n", info->name); - goto free_copy; + goto free_shared; } err = rewrite_section_headers(info, flags); if (err) - goto free_copy; + goto free_shared; /* Check module struct version now, before we try to use module. */ if (!check_modstruct_version(info, info->mod)) { err = -ENOEXEC; - goto free_copy; + goto free_shared; } /* Figure out module layout, and allocate all the memory. */ mod = layout_and_allocate(info, flags); if (IS_ERR(mod)) { err = PTR_ERR(mod); - goto free_copy; + goto free_shared; } audit_log_kern_module(mod->name); /* Reserve our place in the list. */ - err = add_unformed_module(mod); - if (err) - goto free_module; + add_unformed_module(mod); #ifdef CONFIG_MODULE_SIG mod->sig_ok = info->sig_ok; @@ -2847,7 +2938,9 @@ static int load_module(struct load_info *info, const char __user *uargs, /* Done! */ trace_module_load(mod); - return do_init_module(mod); + err = do_init_module(mod); + finalize_running_load(shared_info, err); + return err; sysfs_cleanup: mod_sysfs_teardown(mod); @@ -2880,15 +2973,15 @@ static int load_module(struct load_info *info, const char __user *uargs, /* Unlink carefully: kallsyms could be walking list. */ list_del_rcu(&mod->list); mod_tree_remove(mod); - wake_up_interruptible(&module_wq); /* Wait for RCU-sched synchronizing before releasing mod->list. */ synchronize_rcu(); mutex_unlock(&module_mutex); - free_module: /* Free lock-classes; relies on the preceding sync_rcu() */ lockdep_free_key_range(mod->data_layout.base, mod->data_layout.size); module_deallocate(mod, info); + free_shared: + finalize_running_load(shared_info, err); free_copy: free_copy(info, flags); return err; From patchwork Sun Oct 16 12:30:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Pavlu X-Patchwork-Id: 13007771 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8702CC43217 for ; Sun, 16 Oct 2022 12:31:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229712AbiJPMbd (ORCPT ); Sun, 16 Oct 2022 08:31:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229797AbiJPMb2 (ORCPT ); Sun, 16 Oct 2022 08:31:28 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5EDB031DC2; Sun, 16 Oct 2022 05:31:22 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id D0CAA20479; Sun, 16 Oct 2022 12:31:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1665923480; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PYqXE6gTN7KoVw9OC/xhUSUlvDrYiQN8v/Jf2xRpmok=; b=sgmUHdwlGUrFxU0If1TzdLxN4u/ZkMibWP6aaObTaGTAYogCLf0WWg90MyQrOfomrSCf+Y upV0alFTcOAu52NJ4MzMlvra4MSTijbMOT/8MYaC/LLfhIyrj16raLLGVBmRTNST3oUJVM BJ6oWRKJZQeGtBYu7B0Re7Zdy1VAF3Q= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id A884513A36; Sun, 16 Oct 2022 12:31:20 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id YC8+KJj5S2MyMAAAMHmgww (envelope-from ); Sun, 16 Oct 2022 12:31:20 +0000 From: Petr Pavlu To: mcgrof@kernel.org Cc: pmladek@suse.com, david@redhat.com, linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org, Petr Pavlu Subject: [PATCH v3 4/4] selftests: kmod: Add tests for merging same-name module load requests Date: Sun, 16 Oct 2022 14:30:31 +0200 Message-Id: <20221016123031.3963-5-petr.pavlu@suse.com> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20221016123031.3963-1-petr.pavlu@suse.com> References: <20221016123031.3963-1-petr.pavlu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: Add two tests to check that loading the same module multiple times in parallel results only in one real attempt to initialize it. Synchronization of the loads is done by waiting 1000 ms in the init function of a sample module kmod_test_0014. The tests measure time needed to perform all inserts to verify that the loads get merged by the module loader and are not serialized. * Case 0014 checks a situation when the load is successful. It should result in one insert returning 0 and remaining inserts returning EEXIST. * Case 0015 checks a situation when the load is failing because the module init function returns ENODEV. It should result in one insert returning this error code and remaining inserts returning EBUSY. The tests use a simple init_module program to load kmod_test_0014.ko. It enables to obtain directly a return code from the finit_module syscall. Signed-off-by: Petr Pavlu --- tools/testing/selftests/kmod/.gitignore | 1 + tools/testing/selftests/kmod/Makefile | 17 ++- tools/testing/selftests/kmod/init_module.c | 49 ++++++ tools/testing/selftests/kmod/kmod.sh | 139 ++++++++++++++++++ .../selftests/kmod/kmod_test_0014/Makefile | 14 ++ .../kmod/kmod_test_0014/kmod_test_0014.c | 29 ++++ 6 files changed, 244 insertions(+), 5 deletions(-) create mode 100644 tools/testing/selftests/kmod/.gitignore create mode 100644 tools/testing/selftests/kmod/init_module.c create mode 100644 tools/testing/selftests/kmod/kmod_test_0014/Makefile create mode 100644 tools/testing/selftests/kmod/kmod_test_0014/kmod_test_0014.c diff --git a/tools/testing/selftests/kmod/.gitignore b/tools/testing/selftests/kmod/.gitignore new file mode 100644 index 000000000000..ea3afcaccc79 --- /dev/null +++ b/tools/testing/selftests/kmod/.gitignore @@ -0,0 +1 @@ +init_module diff --git a/tools/testing/selftests/kmod/Makefile b/tools/testing/selftests/kmod/Makefile index 5b3e746a0bee..48aeaae76211 100644 --- a/tools/testing/selftests/kmod/Makefile +++ b/tools/testing/selftests/kmod/Makefile @@ -1,12 +1,19 @@ # SPDX-License-Identifier: GPL-2.0-only # Makefile for kmod loading selftests -# No binaries, but make sure arg-less "make" doesn't trigger "run_tests" -all: - TEST_PROGS := kmod.sh +# compile but not part of 'make run_tests' +TEST_GEN_PROGS_EXTENDED := init_module kmod_test_0014.ko + +# override lib.mk's default rule +override define CLEAN + $(RM) $(TEST_GEN_PROGS_EXTENDED) + $(MAKE) -C kmod_test_0014 clean +endef + include ../lib.mk -# Nothing to clean up. -clean: +$(OUTPUT)/kmod_test_0014.ko: $(wildcard kmod_test_0014/Makefile kmod_test_0014/*.[ch]) + $(MAKE) -C kmod_test_0014 + cp kmod_test_0014/kmod_test_0014.ko $@ diff --git a/tools/testing/selftests/kmod/init_module.c b/tools/testing/selftests/kmod/init_module.c new file mode 100644 index 000000000000..529abd0a8357 --- /dev/null +++ b/tools/testing/selftests/kmod/init_module.c @@ -0,0 +1,49 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * Simple program to insert a module, similar to insmod but tailored + * specifically for needs of module loader tests. + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include + +int main(int argc, char *argv[]) +{ + const char *filename, *args; + int fd, ret; + + if (argc < 2 || argc > 3) { + fprintf(stderr, "usage: %s filename [args]\n", argv[0]); + return EINVAL; + } + + filename = argv[1]; + args = argc > 2 ? argv[2] : ""; + + fd = open(filename, O_RDONLY); + if (fd == -1) { + ret = errno; + fprintf(stderr, "init_module: could not open file %s: %s\n", + filename, strerror(ret)); + return ret; + } + + ret = syscall(SYS_finit_module, fd, args, 0); + if (ret != 0) { + ret = errno; + fprintf(stderr, "init_module: could not load module %s: %s\n", + filename, strerror(ret)); + } + + close(fd); + + return ret; +} diff --git a/tools/testing/selftests/kmod/kmod.sh b/tools/testing/selftests/kmod/kmod.sh index afd42387e8b2..8836d7fe1a0a 100755 --- a/tools/testing/selftests/kmod/kmod.sh +++ b/tools/testing/selftests/kmod/kmod.sh @@ -65,6 +65,8 @@ ALL_TESTS="$ALL_TESTS 0010:1:1" ALL_TESTS="$ALL_TESTS 0011:1:1" ALL_TESTS="$ALL_TESTS 0012:1:1" ALL_TESTS="$ALL_TESTS 0013:1:1" +ALL_TESTS="$ALL_TESTS 0014:1:1" +ALL_TESTS="$ALL_TESTS 0015:1:1" # Kselftest framework requirement - SKIP code is 4. ksft_skip=4 @@ -171,6 +173,12 @@ errno_name_to_val() echo -1;; -ENOENT) echo -2;; + -EBUSY) + echo -16;; + -EEXIST) + echo -17;; + -ENODEV) + echo -19;; -EINVAL) echo -22;; -ERR_ANY) @@ -190,6 +198,12 @@ errno_val_to_name() echo -EPERM;; -2) echo -ENOENT;; + -16) + echo -EBUSY;; + -17) + echo -EEXIST;; + -19) + echo -ENODEV;; -22) echo -EINVAL;; -123456) @@ -504,6 +518,129 @@ kmod_test_0013() "cat /sys/module/${DEFAULT_KMOD_DRIVER}/sections/.*text | head -n1" } +# Test that loading the same module multiple times in parallel results only in +# one real attempt to initialize it. Synchronization of the loads is done by +# waiting 1000 ms in the init function of a sample module kmod_test_0014. Tests +# measure time needed to perform all inserts to verify that the loads get merged +# by the module loader and are not serialized. +# +# * Case 0014 checks a situation when the load is successful. It should result +# in one insert returning 0 and remaining inserts returning EEXIST. +# * Case 0015 checks a situation when the load is failing because the module +# init function returns ENODEV. It should result in one insert returning this +# error code and remaining inserts returning EBUSY. +kmod_check_parallel_loads() +{ + local test_name="$1" + local parallel_loads="$2" + local insmod_params="$3" + local max_time="$4" + local success_exp="$5" + local ebusy_exp="$6" + local eexist_exp="$7" + local enodev_exp="$8" + local other_exp="$9" + + local EBUSY=$(errno_name_to_val -EBUSY) + local EEXIST=$(errno_name_to_val -EEXIST) + local ENODEV=$(errno_name_to_val -ENODEV) + local start end elapsed pids + local success_cnt=0 ebusy_cnt=0 eexist_cnt=0 enodev_cnt=0 other_cnt=0 + local test_ok=true + + # Do a dummy insert to get the file into the page cache. + ./init_module kmod_test_0014.ko "sleep_msecs=0 retval=0" + rmmod kmod_test_0014 + + # Run the parallel loads. + let start=$(date +%s%N)/1000000 + + for i in $(seq 0 $(($parallel_loads - 1))); do + ./init_module kmod_test_0014.ko "$insmod_params" & + pids[$i]=$! + done + + for pid in ${pids[*]}; do + local rc + { wait $pid; let rc=-$?; } || true + + case $rc in + 0) + let success_cnt=$success_cnt+1;; + $EBUSY) + let ebusy_cnt=$ebusy_cnt+1;; + $EEXIST) + let eexist_cnt=$eexist_cnt+1;; + $ENODEV) + let enodev_cnt=$enodev_cnt+1;; + *) + let other_cnt=$other_cnt+1;; + esac + done + + let end=$(date +%s%N)/1000000 + let elapsed=$end-$start + + if (( $success_cnt > 0 )); then + rmmod kmod_test_0014 + fi + + # Check the results. + if (( $elapsed > $max_time )); then + echo "$test_name: FAIL, time to load the modules is within limit, test expects '<$max_time' ms - got '$elapsed' ms" >&2 + test_ok=false + else + echo "$test_name: OK! - time to load the modules is within limit, test expects '<$max_time' ms - got '$elapsed' ms" >&2 + fi + if (( $success_cnt != $success_exp )); then + echo "$test_name: FAIL, number of successful loads, test expects '$success_exp' - got '$success_cnt'" >&2 + test_ok=false + else + echo "$test_name: OK! - number of successful loads, test expects '$success_exp' - got '$success_cnt'" >&2 + fi + if (( $ebusy_cnt != $ebusy_exp )); then + echo "$test_name: FAIL, number of loads returning EBUSY, test expects '$ebusy_exp' - got '$ebusy_cnt'" >&2 + test_ok=false + else + echo "$test_name: OK! - number of loads returning EBUSY, test expects '$ebusy_exp' - got '$ebusy_cnt'" >&2 + fi + if (( $eexist_cnt != $eexist_exp )); then + echo "$test_name: FAIL, number of loads returning EEXIST, test expects '$eexist_exp' - got '$eexist_cnt'" >&2 + test_ok=false + else + echo "$test_name: OK! - number of loads returning EEXIST, test expects '$eexist_exp' - got '$eexist_cnt'" >&2 + fi + if (( $enodev_cnt != $enodev_exp )); then + echo "$test_name: FAIL, number of loads returning ENODEV, test expects '$enodev_exp' - got '$enodev_cnt'" >&2 + test_ok=false + else + echo "$test_name: OK! - number of loads returning ENODEV, test expects '$enodev_exp' - got '$enodev_cnt'" >&2 + fi + if (( $other_cnt != $other_exp )); then + echo "$test_name: FAIL, number of loads returning other values, test expects '$other_exp' - got '$other_cnt'" >&2 + test_ok=false + else + echo "$test_name: OK! - number of loads returning other values, test expects '$other_exp' - got '$other_cnt'" >&2 + fi + + [ $test_ok = true ] || exit 1 +} + +kmod_test_0014() +{ + kmod_check_parallel_loads \ + "${FUNCNAME[0]}" 4 "sleep_msecs=1000 retval=0" 3000 \ + 1 0 3 0 0 +} + +kmod_test_0015() +{ + local ENODEV=$(errno_name_to_val -ENODEV) + kmod_check_parallel_loads \ + "${FUNCNAME[0]}" 4 "sleep_msecs=1000 retval=$ENODEV" 3000 \ + 0 3 0 1 0 +} + list_tests() { echo "Test ID list:" @@ -525,6 +662,8 @@ list_tests() echo "0011 x $(get_test_count 0011) - test completely disabling module autoloading" echo "0012 x $(get_test_count 0012) - test /proc/modules address visibility under CAP_SYSLOG" echo "0013 x $(get_test_count 0013) - test /sys/module/*/sections/* visibility under CAP_SYSLOG" + echo "0014 x $(get_test_count 0014) - test handling of parallel loads, success case" + echo "0015 x $(get_test_count 0015) - test handling of parallel loads, init returning error" } usage() diff --git a/tools/testing/selftests/kmod/kmod_test_0014/Makefile b/tools/testing/selftests/kmod/kmod_test_0014/Makefile new file mode 100644 index 000000000000..c90afe5a98ad --- /dev/null +++ b/tools/testing/selftests/kmod/kmod_test_0014/Makefile @@ -0,0 +1,14 @@ +# SPDX-License-Identifier: GPL-2.0-only + +KMOD_TEST_0014_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST))))) +KDIR ?= $(abspath $(KMOD_TEST_0014_DIR)/../../../../..) + +MODULES = kmod_test_0014.ko + +obj-m += kmod_test_0014.o + +all: + $(MAKE) -C $(KDIR) M=$(KMOD_TEST_0014_DIR) modules + +clean: + $(MAKE) -C $(KDIR) M=$(KMOD_TEST_0014_DIR) clean diff --git a/tools/testing/selftests/kmod/kmod_test_0014/kmod_test_0014.c b/tools/testing/selftests/kmod/kmod_test_0014/kmod_test_0014.c new file mode 100644 index 000000000000..2b2ebeefe689 --- /dev/null +++ b/tools/testing/selftests/kmod/kmod_test_0014/kmod_test_0014.c @@ -0,0 +1,29 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include + +static unsigned int sleep_msecs = 1000; +module_param(sleep_msecs, uint, 0644); +MODULE_PARM_DESC(sleep_msecs, "init sleep in miliseconds, default 1000"); + +static int retval; +module_param(retval, int, 0644); +MODULE_PARM_DESC(retval, "init return value, default 0"); + +static int __init kmod_test_0014_init(void) +{ + msleep(sleep_msecs); + return retval; +} + +static void __exit kmod_test_0014_exit(void) +{ +} + +module_init(kmod_test_0014_init); +module_exit(kmod_test_0014_exit); + +MODULE_AUTHOR("Petr Pavlu "); +MODULE_LICENSE("GPL");