ovl: port to new mount api

We recently ported util-linux to the new mount api. The mount(8) tool
now uses the new mount api by default. While it tries hard to fall back
to the old mount api gracefully, there are still cases where we run
into issues that are difficult to handle nicely.

Now with mount(8) and libmount supporting the new mount api I expect an
increase in the number of bug reports and issues we're going to see with
filesystems that don't yet support the new mount api. So it's time we
rectify this.

For overlayfs specifically we ran into issues where mount(8) passed
multiple lower layers as one big string through fsconfig(). But the
fsconfig() FSCONFIG_SET_STRING option is limited to 256 bytes in
strndup_user(). While this would be fixable by extending the fsconfig()
buffer I'd rather encourage users to append layers via multiple
fsconfig() calls as the interface allows nicely for this. This has also
been requested as a feature before.

With this port to the new mount api the following will be possible:

	fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", "/lower1", 0);

	/* set upper layer */
	fsconfig(fs_fd, FSCONFIG_SET_STRING, "upperdir", "/upper", 0);

	/* append "/lower2", "/lower3", and "/lower4" */
	fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", ":/lower2:/lower3:/lower4", 0);

	/* turn index feature on */
	fsconfig(fs_fd, FSCONFIG_SET_STRING, "index", "on", 0);

	/* append "/lower5" */
	fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", ":/lower5", 0);

Previously a lowerdir value with a leading ':' would have been rejected,
so this isn't a regression. And we can't simply use "lowerdir=/lower" to
append on top of existing layers: "lowerdir=/lower,lowerdir=/other-lower"
makes "/other-lower" the only lower layer, so we'd break uapi if we
changed this. So the ':' prefix seems a good compromise.

Users can choose to specify multiple layers at once or individual
layers. A layer is appended if it starts with ":". This requires that
the user has already added at least one layer before. If lowerdir is
specified again without a leading ":" then all previous layers are
dropped and replaced with the new layers. If lowerdir is specified and
empty then all layers are simply dropped.
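The rules above can be restated as a small userspace model. This is
only a sketch of the described semantics, not the kernel
implementation. The names layer_model and model_set_lowerdir() and the
MODEL_* limits are hypothetical. The model does not resolve paths or
handle escaped colons.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_LAYERS 16	/* arbitrary small limit for the model */
#define MODEL_MAX_PATH   256	/* arbitrary per-layer length limit */

struct layer_model {
	size_t nr;
	char paths[MODEL_MAX_LAYERS][MODEL_MAX_PATH];
};

/*
 * Apply one "lowerdir" value to the model:
 *   ""          drop all layers
 *   ":/a:/b"    append /a and /b (requires at least one existing layer)
 *   "/a:/b"     replace all existing layers with /a and /b
 * Returns 0 on success or -EINVAL, leaving the model untouched on error.
 */
int model_set_lowerdir(struct layer_model *m, const char *value)
{
	struct layer_model tmp;
	const char *p = value;

	assert(m != NULL && value != NULL);
	tmp = *m;			/* work on a copy, commit on success */

	if (*p == '\0') {
		m->nr = 0;
		return 0;
	}

	if (*p == ':') {
		if (tmp.nr == 0)
			return -EINVAL;	/* nothing to append to */
		p++;
	} else {
		tmp.nr = 0;		/* no ':' prefix: start over */
	}

	for (;;) {
		size_t len = strcspn(p, ":");

		if (len == 0 || len >= MODEL_MAX_PATH ||
		    tmp.nr == MODEL_MAX_LAYERS)
			return -EINVAL;

		memcpy(tmp.paths[tmp.nr], p, len);
		tmp.paths[tmp.nr][len] = '\0';
		tmp.nr++;

		if (p[len] == '\0')
			break;
		p += len + 1;		/* skip the ':' separator */
	}

	*m = tmp;
	return 0;
}
```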

An additional change is that overlayfs will now parse and resolve layers
right when they are specified in fsconfig() instead of deferring until
super block creation. This allows users to receive early errors.

It also allows users to actually use up to 500 layers, something which
was theoretically possible but ended up not working because the mount
option string passed via mount(2) was too large.
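One way userspace could stay under the per-call string limit while
still adding many layers is to pack as many layers as fit into each
"lowerdir" value and append the rest in later calls. The sketch below
is hypothetical: pack_lowerdir() is not an existing helper, the buffer
size is up to the caller, and layer paths containing ':' are not
handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Pack as many of the given layers as fit into buf (including the
 * terminating NUL). When append is non-zero the value starts with ':'
 * so that it extends layers set by an earlier call.
 * Returns the number of layers consumed; 0 means not even the first
 * layer fits.
 */
size_t pack_lowerdir(char *buf, size_t size, const char *const *layers,
		     size_t nr, int append)
{
	size_t used = 0, i;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (i = 0; i < nr; i++) {
		size_t len = strlen(layers[i]);
		/* every layer except the first of a fresh value needs ':' */
		size_t sep = (i > 0 || append) ? 1 : 0;

		if (used + sep + len + 1 > size)
			break;

		if (sep)
			buf[used++] = ':';
		memcpy(buf + used, layers[i], len);
		used += len;
		buf[used] = '\0';
	}

	return i;
}
```

A caller would loop until all layers are consumed, passing append = 0
for the first value and append = 1 afterwards, and hand each buf to
fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", buf, 0).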

This also allows a more privileged process to set config options for a
lesser privileged process as the creds for fsconfig() and the creds for
fsopen() can differ. We could enforce that the creds of fsopen() and
fsconfig() match, but I don't see why that needs to be the case, and
letting them differ allows for a good delegation mechanism.

Plus, in the future it means we're able to extend overlayfs mount
options and allow users to specify layers via file descriptors instead
of paths:

	fsconfig(fs_fd, FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower1", dirfd);

	/* append */
	fsconfig(fs_fd, FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower2", dirfd);

	/* append */
	fsconfig(fs_fd, FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower3", dirfd);

	/* clear all layers specified until now */
	fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", NULL, 0);

This would be especially nice if users create an overlayfs mount on top
of idmapped layers or just in general private mounts created via
open_tree(OPEN_TREE_CLONE). Those mounts would then never have to appear
anywhere in the filesystem. But for now just do the minimal thing.

We should probably aim to move more validation into ovl_fs_parse_param()
so users get errors before fsconfig(FSCONFIG_CMD_CREATE). But that can
be done in additional patches later.

Link: https://github.com/util-linux/util-linux/issues/2287 [1]
Link: https://github.com/util-linux/util-linux/issues/1992 [2]
Link: https://bugs.archlinux.org/task/78702 [3]
Link: https://lore.kernel.org/linux-unionfs/20230530-klagen-zudem-32c0908c2108@brauner [4]
Signed-off-by: Christian Brauner <brauner@kernel.org>
---

I'm starting to get the feeling that I stared enough at this and I would
need a fresh set of eyes to review it for any bugs. Plus, Amir seems to
have conflicting series and I would have to rebase anyway so no point in
delaying this any further.
---
 fs/overlayfs/super.c | 896 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 568 insertions(+), 328 deletions(-)

---
base-commit: 9561de3a55bed6bdd44a12820ba81ec416e705a7
change-id: 20230605-fs-overlayfs-mount_api-20ea8b04eff4

Message-ID: <20230605-fs-overlayfs-mount_api-v1-1-a8d78c3fbeaf@kernel.org>
State: Changes Requested
From: Christian Brauner <brauner@kernel.org>
Date: Thu, 08 Jun 2023 18:07:45 +0200
Subject: [PATCH] ovl: port to new mount api
To: Amir Goldstein <amir73il@gmail.com>, Miklos Szeredi <miklos@szeredi.hu>
Cc: linux-fsdevel@vger.kernel.org, linux-unionfs@vger.kernel.org, Christian Brauner <brauner@kernel.org>
