From patchwork Tue Apr 23 13:53:44 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Denis V. Lunev\" via"
X-Patchwork-Id: 10913155
To: qemu-devel@nongnu.org
Date: Tue, 23 Apr 2019 15:53:44 +0200
Message-ID: <2441220.fK484kklmY@silver>
Subject: [Qemu-devel] [PATCH 1/3] 9p: mitigates most QID path collisions
X-Patchwork-Original-From: Christian Schoenebeck via Qemu-devel
From: "Denis V. Lunev\" via"
Reply-To: Christian Schoenebeck
Cc: Greg Kurz, Antonios Motakis
Sender: "Qemu-devel"

I am attempting to revive Antonios Motakis' effort to fix the current
file ID collisions with 9p. My patch set does not address all the
concerns you had with his original patch set; in particular, in the
(very rare) case that a QID path collision still occurs, the affected
files remain visible on the guest. However, I hope we can bring this
overall issue forward, because with the current 9p implementation it is
almost inevitable to end up with QID path collisions, which in turn
cause very severe misbehaviours like data corruption and data loss on
the guest. The still unresolved issues that you noted are, IMO, rather
theoretical, because in practice you will usually have no more than one
or two dozen entries in qpp_table (and not a single entry in
qpf_table).

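As an illustration only (this is not code from the series): a minimal,
self-contained C sketch of the prefix-mapping idea behind qpp_table,
assuming a small fixed-size array instead of QEMU's qht. The names
DemoPrefixEntry, DemoPrefixTable, demo_path_old and demo_path_prefixmap
are hypothetical. It shows why using the inode number alone as the QID
path collides across host devices, and why one table entry per
(device, upper 16 inode bits) pair is enough to keep the paths apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_INO_MASK     ((UINT64_C(1) << 48) - 1) /* low 48 inode bits */
#define DEMO_MAX_PREFIXES 64                        /* arbitrary demo capacity */

/* One (device, inode prefix) pair and the QID path prefix assigned to it. */
typedef struct {
    uint64_t dev;
    uint16_t ino_prefix;
    uint16_t qp_prefix;
} DemoPrefixEntry;

typedef struct {
    DemoPrefixEntry entries[DEMO_MAX_PREFIXES];
    size_t count;
    uint16_t next_prefix; /* 0 is reserved and means "no prefix left" */
} DemoPrefixTable;

static void demo_table_init(DemoPrefixTable *t)
{
    assert(t != NULL);
    t->count = 0;
    t->next_prefix = 1;
}

/* Old behaviour: the QID path is just the inode number, device ignored. */
static uint64_t demo_path_old(uint64_t dev, uint64_t ino)
{
    (void)dev;
    return ino;
}

/*
 * Prefix mapping: look up (dev, upper 16 bits of ino) in the table and
 * allocate a fresh 16-bit prefix on first sight. The resulting path is
 * that prefix in the upper 16 bits plus the lower 48 bits of the inode.
 * Returns 0 on success, -1 when no prefix or table slot is left.
 */
static int demo_path_prefixmap(DemoPrefixTable *t, uint64_t dev,
                               uint64_t ino, uint64_t *path)
{
    uint16_t ino_prefix = (uint16_t)(ino >> 48);
    size_t i;

    assert(t != NULL && path != NULL);

    /* linear search is fine for a demo table with a handful of entries */
    for (i = 0; i < t->count; i++) {
        if (t->entries[i].dev == dev &&
            t->entries[i].ino_prefix == ino_prefix) {
            break;
        }
    }

    if (i == t->count) {
        if (t->next_prefix == 0 || t->count == DEMO_MAX_PREFIXES) {
            return -1;
        }
        t->entries[i].dev = dev;
        t->entries[i].ino_prefix = ino_prefix;
        t->entries[i].qp_prefix = t->next_prefix++; /* wraps to 0 when full */
        t->count++;
    }

    *path = ((uint64_t)t->entries[i].qp_prefix << 48) | (ino & DEMO_INO_MASK);
    return 0;
}
```

With two host devices that both contain inode 42, demo_path_old()
returns the same path for both files, while demo_path_prefixmap()
returns distinct paths and needs just two table entries, no matter how
many files each device holds (as long as their inode numbers fit into
48 bits).
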
So this first patch is an updated version of Antonios Motakis'
original 4-patch set, merged into one patch:

https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg02283.html

* Updated to latest git master, specifically to the new qht interface.

* Merged the original 4 patches into this single patch.

Signed-off-by: Christian Schoenebeck
---
 fsdev/9p-marshal.h |   4 +-
 hw/9pfs/9p.c       | 200 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 hw/9pfs/9p.h       |  21 ++++++
 3 files changed, 204 insertions(+), 21 deletions(-)

diff --git a/fsdev/9p-marshal.h b/fsdev/9p-marshal.h
index c8823d878f..d1ad3645c4 100644
--- a/fsdev/9p-marshal.h
+++ b/fsdev/9p-marshal.h
@@ -10,8 +10,8 @@ typedef struct V9fsString
 typedef struct V9fsQID
 {
     int8_t type;
-    int32_t version;
-    int64_t path;
+    uint32_t version;
+    uint64_t path;
 } V9fsQID;
 
 typedef struct V9fsStat
diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 55821343e5..b9bbdcbaee 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -25,6 +25,7 @@
 #include "trace.h"
 #include "migration/blocker.h"
 #include "sysemu/qtest.h"
+#include "qemu/xxhash.h"
 
 int open_fd_hw;
 int total_open_fd;
@@ -571,14 +572,135 @@ static void coroutine_fn virtfs_reset(V9fsPDU *pdu)
                                 P9_STAT_MODE_NAMED_PIPE |   \
                                 P9_STAT_MODE_SOCKET)
 
-/* This is the algorithm from ufs in spfs */
-static void stat_to_qid(const struct stat *stbuf, V9fsQID *qidp)
+
+/* creative abuse of qemu_xxhash7, which is based on xxhash */
+static uint32_t qpp_hash(QppEntry e)
 {
-    size_t size;
+    return qemu_xxhash7(e.ino_prefix, e.dev, 0, 0, 0);
+}
+
+static uint32_t qpf_hash(QpfEntry e)
+{
+    return qemu_xxhash7(e.ino, e.dev, 0, 0, 0);
+}
+
+static bool qpp_cmp_func(const void *obj, const void *userp)
+{
+    const QppEntry *e1 = obj, *e2 = userp;
+    return (e1->dev == e2->dev) && (e1->ino_prefix == e2->ino_prefix);
+}
+
+static bool qpf_cmp_func(const void *obj, const void *userp)
+{
+    const QpfEntry *e1 = obj, *e2 = userp;
+    return (e1->dev == e2->dev) && (e1->ino == e2->ino);
+}
+
+static void qp_table_remove(void *p, uint32_t h, void *up)
+{
+    g_free(p);
+}
+
+static void qp_table_destroy(struct qht *ht)
+{
+    qht_iter(ht, qp_table_remove, NULL);
+    qht_destroy(ht);
+}
+
+static int qid_path_fullmap(V9fsPDU *pdu, const struct stat *stbuf,
+                            uint64_t *path)
+{
+    QpfEntry lookup = {
+        .dev = stbuf->st_dev,
+        .ino = stbuf->st_ino
+    }, *val;
+    uint32_t hash = qpf_hash(lookup);
+
+    /* most users won't need the fullmap, so init the table lazily */
+    if (!pdu->s->qpf_table.map) {
+        qht_init(&pdu->s->qpf_table, qpf_cmp_func, 1 << 16, QHT_MODE_AUTO_RESIZE);
+    }
+
+    val = qht_lookup(&pdu->s->qpf_table, &lookup, hash);
+
+    if (!val) {
+        if (pdu->s->qp_fullpath_next == 0) {
+            /* no more files can be mapped :'( */
+            return -ENFILE;
+        }
+
+        val = g_malloc0(sizeof(QppEntry));
+        if (!val) {
+            return -ENOMEM;
+        }
+        *val = lookup;
+
+        /* new unique inode and device combo */
+        val->path = pdu->s->qp_fullpath_next++;
+        pdu->s->qp_fullpath_next &= QPATH_INO_MASK;
+        qht_insert(&pdu->s->qpf_table, val, hash, NULL);
+    }
+
+    *path = val->path;
+    return 0;
+}
+
+/* stat_to_qid needs to map inode number (64 bits) and device id (32 bits)
+ * to a unique QID path (64 bits). To avoid having to map and keep track
+ * of up to 2^64 objects, we map only the 16 highest bits of the inode plus
+ * the device id to the 16 highest bits of the QID path. The 48 lowest bits
+ * of the QID path equal to the lowest bits of the inode number.
+ *
+ * This takes advantage of the fact that inode number are usually not
+ * random but allocated sequentially, so we have fewer items to keep
+ * track of.
+ */
+static int qid_path_prefixmap(V9fsPDU *pdu, const struct stat *stbuf,
+                                uint64_t *path)
+{
+    QppEntry lookup = {
+        .dev = stbuf->st_dev,
+        .ino_prefix = (uint16_t) (stbuf->st_ino >> 48)
+    }, *val;
+    uint32_t hash = qpp_hash(lookup);
+
+    val = qht_lookup(&pdu->s->qpp_table, &lookup, hash);
+
+    if (!val) {
+        if (pdu->s->qp_prefix_next == 0) {
+            /* we ran out of prefixes */
+            return -ENFILE;
+        }
+
+        val = g_malloc0(sizeof(QppEntry));
+        if (!val) {
+            return -ENOMEM;
+        }
+        *val = lookup;
+
+        /* new unique inode prefix and device combo */
+        val->qp_prefix = pdu->s->qp_prefix_next++;
+        qht_insert(&pdu->s->qpp_table, val, hash, NULL);
+    }
+
+    *path = ((uint64_t)val->qp_prefix << 48) | (stbuf->st_ino & QPATH_INO_MASK);
+    return 0;
+}
+
+static int stat_to_qid(V9fsPDU *pdu, const struct stat *stbuf, V9fsQID *qidp)
+{
+    int err;
+
+    /* map inode+device to qid path (fast path) */
+    err = qid_path_prefixmap(pdu, stbuf, &qidp->path);
+    if (err == -ENFILE) {
+        /* fast path didn't work, fall back to full map */
+        err = qid_path_fullmap(pdu, stbuf, &qidp->path);
+    }
+    if (err) {
+        return err;
+    }
 
-    memset(&qidp->path, 0, sizeof(qidp->path));
-    size = MIN(sizeof(stbuf->st_ino), sizeof(qidp->path));
-    memcpy(&qidp->path, &stbuf->st_ino, size);
     qidp->version = stbuf->st_mtime ^ (stbuf->st_size << 8);
     qidp->type = 0;
     if (S_ISDIR(stbuf->st_mode)) {
@@ -587,6 +709,8 @@ static void stat_to_qid(const struct stat *stbuf, V9fsQID *qidp)
     if (S_ISLNK(stbuf->st_mode)) {
         qidp->type |= P9_QID_TYPE_SYMLINK;
     }
+
+    return 0;
 }
 
 static int coroutine_fn fid_to_qid(V9fsPDU *pdu, V9fsFidState *fidp,
@@ -599,7 +723,10 @@ static int coroutine_fn fid_to_qid(V9fsPDU *pdu, V9fsFidState *fidp,
     if (err < 0) {
         return err;
     }
-    stat_to_qid(&stbuf, qidp);
+    err = stat_to_qid(pdu, &stbuf, qidp);
+    if (err < 0) {
+        return err;
+    }
     return 0;
 }
 
@@ -830,7 +957,10 @@ static int coroutine_fn stat_to_v9stat(V9fsPDU *pdu, V9fsPath *path,
 
     memset(v9stat, 0, sizeof(*v9stat));
 
-    stat_to_qid(stbuf, &v9stat->qid);
+    err = stat_to_qid(pdu, stbuf, &v9stat->qid);
+    if (err < 0) {
+        return err;
+    }
     v9stat->mode = stat_to_v9mode(stbuf);
     v9stat->atime = stbuf->st_atime;
     v9stat->mtime = stbuf->st_mtime;
@@ -891,7 +1021,7 @@ static int coroutine_fn stat_to_v9stat(V9fsPDU *pdu, V9fsPath *path,
 #define P9_STATS_ALL 0x00003fffULL /* Mask for All fields above */
 
 
-static void stat_to_v9stat_dotl(V9fsState *s, const struct stat *stbuf,
+static int stat_to_v9stat_dotl(V9fsPDU *pdu, const struct stat *stbuf,
                                 V9fsStatDotl *v9lstat)
 {
     memset(v9lstat, 0, sizeof(*v9lstat));
@@ -913,7 +1043,7 @@ static void stat_to_v9stat_dotl(V9fsState *s, const struct stat *stbuf,
     /* Currently we only support BASIC fields in stat */
     v9lstat->st_result_mask = P9_STATS_BASIC;
 
-    stat_to_qid(stbuf, &v9lstat->qid);
+    return stat_to_qid(pdu, stbuf, &v9lstat->qid);
 }
 
 static void print_sg(struct iovec *sg, int cnt)
@@ -1115,7 +1245,6 @@ static void coroutine_fn v9fs_getattr(void *opaque)
     uint64_t request_mask;
     V9fsStatDotl v9stat_dotl;
     V9fsPDU *pdu = opaque;
-    V9fsState *s = pdu->s;
 
     retval = pdu_unmarshal(pdu, offset, "dq", &fid, &request_mask);
     if (retval < 0) {
@@ -1136,7 +1265,10 @@ static void coroutine_fn v9fs_getattr(void *opaque)
     if (retval < 0) {
         goto out;
     }
-    stat_to_v9stat_dotl(s, &stbuf, &v9stat_dotl);
+    retval = stat_to_v9stat_dotl(pdu, &stbuf, &v9stat_dotl);
+    if (retval < 0) {
+        goto out;
+    }
 
     /* fill st_gen if requested and supported by underlying fs */
     if (request_mask & P9_STATS_GEN) {
@@ -1381,7 +1513,10 @@ static void coroutine_fn v9fs_walk(void *opaque)
             if (err < 0) {
                 goto out;
             }
-            stat_to_qid(&stbuf, &qid);
+            err = stat_to_qid(pdu, &stbuf, &qid);
+            if (err < 0) {
+                goto out;
+            }
             v9fs_path_copy(&dpath, &path);
         }
         memcpy(&qids[name_idx], &qid, sizeof(qid));
@@ -1483,7 +1618,10 @@ static void coroutine_fn v9fs_open(void *opaque)
     if (err < 0) {
         goto out;
     }
-    stat_to_qid(&stbuf, &qid);
+    err = stat_to_qid(pdu, &stbuf, &qid);
+    if (err < 0) {
+        goto out;
+    }
     if (S_ISDIR(stbuf.st_mode)) {
         err = v9fs_co_opendir(pdu, fidp);
         if (err < 0) {
@@ -1593,7 +1731,10 @@ static void coroutine_fn v9fs_lcreate(void *opaque)
         fidp->flags |= FID_NON_RECLAIMABLE;
     }
     iounit = get_iounit(pdu, &fidp->path);
-    stat_to_qid(&stbuf, &qid);
+    err = stat_to_qid(pdu, &stbuf, &qid);
+    if (err < 0) {
+        goto out;
+    }
     err = pdu_marshal(pdu, offset, "Qd", &qid, iounit);
     if (err < 0) {
         goto out;
@@ -2327,7 +2468,10 @@ static void coroutine_fn v9fs_create(void *opaque)
         }
     }
     iounit = get_iounit(pdu, &fidp->path);
-    stat_to_qid(&stbuf, &qid);
+    err = stat_to_qid(pdu, &stbuf, &qid);
+    if (err < 0) {
+        goto out;
+    }
     err = pdu_marshal(pdu, offset, "Qd", &qid, iounit);
     if (err < 0) {
         goto out;
@@ -2384,7 +2528,10 @@ static void coroutine_fn v9fs_symlink(void *opaque)
     if (err < 0) {
         goto out;
     }
-    stat_to_qid(&stbuf, &qid);
+    err = stat_to_qid(pdu, &stbuf, &qid);
+    if (err < 0) {
+        goto out;
+    }
     err = pdu_marshal(pdu, offset, "Q", &qid);
     if (err < 0) {
         goto out;
@@ -3064,7 +3211,10 @@ static void coroutine_fn v9fs_mknod(void *opaque)
     if (err < 0) {
         goto out;
     }
-    stat_to_qid(&stbuf, &qid);
+    err = stat_to_qid(pdu, &stbuf, &qid);
+    if (err < 0) {
+        goto out;
+    }
     err = pdu_marshal(pdu, offset, "Q", &qid);
     if (err < 0) {
         goto out;
@@ -3222,7 +3372,10 @@ static void coroutine_fn v9fs_mkdir(void *opaque)
     if (err < 0) {
         goto out;
     }
-    stat_to_qid(&stbuf, &qid);
+    err = stat_to_qid(pdu, &stbuf, &qid);
+    if (err < 0) {
+        goto out;
+    }
     err = pdu_marshal(pdu, offset, "Q", &qid);
     if (err < 0) {
         goto out;
@@ -3633,6 +3786,11 @@ int v9fs_device_realize_common(V9fsState *s, const V9fsTransport *t,
         goto out;
     }
 
+    /* QID path hash table. 1 entry ought to be enough for anybody ;) */
+    qht_init(&s->qpp_table, qpp_cmp_func, 1, QHT_MODE_AUTO_RESIZE);
+    s->qp_prefix_next = 1; /* reserve 0 to detect overflow */
+    s->qp_fullpath_next = 1;
+
     s->ctx.fst = &fse->fst;
     fsdev_throttle_init(s->ctx.fst);
 
@@ -3646,6 +3804,8 @@ out:
         }
         g_free(s->tag);
         g_free(s->ctx.fs_root);
+        qp_table_destroy(&s->qpp_table);
+        qp_table_destroy(&s->qpf_table);
         v9fs_path_free(&path);
     }
     return rc;
@@ -3658,6 +3818,8 @@ void v9fs_device_unrealize_common(V9fsState *s, Error **errp)
     }
     fsdev_throttle_cleanup(s->ctx.fst);
     g_free(s->tag);
+    qp_table_destroy(&s->qpp_table);
+    qp_table_destroy(&s->qpf_table);
     g_free(s->ctx.fs_root);
 }
 
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 8883761b2c..44112ea97f 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -8,6 +8,7 @@
 #include "fsdev/9p-iov-marshal.h"
 #include "qemu/thread.h"
 #include "qemu/coroutine.h"
+#include "qemu/qht.h"
 
 enum {
     P9_TLERROR = 6,
@@ -235,6 +236,22 @@ struct V9fsFidState
     V9fsFidState *rclm_lst;
 };
 
+#define QPATH_INO_MASK (((unsigned long)1 << 48) - 1)
+
+/* QID path prefix entry, see stat_to_qid */
+typedef struct {
+    dev_t dev;
+    uint16_t ino_prefix;
+    uint16_t qp_prefix;
+} QppEntry;
+
+/* QID path full entry, as above */
+typedef struct {
+    dev_t dev;
+    ino_t ino;
+    uint64_t path;
+} QpfEntry;
+
 struct V9fsState
 {
     QLIST_HEAD(, V9fsPDU) free_list;
@@ -256,6 +273,10 @@ struct V9fsState
     Error *migration_blocker;
     V9fsConf fsconf;
     V9fsQID root_qid;
+    struct qht qpp_table;
+    struct qht qpf_table;
+    uint16_t qp_prefix_next;
+    uint64_t qp_fullpath_next;
 };
 
 /* 9p2000.L open flags */

From patchwork Tue Apr 23 13:54:30 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Denis V.
Lunev\" via" X-Patchwork-Id: 10913159 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2A9D7112C for ; Tue, 23 Apr 2019 14:08:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 19A43287F5 for ; Tue, 23 Apr 2019 14:08:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0BD5F2882D; Tue, 23 Apr 2019 14:08:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=2.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 3A99F287F5 for ; Tue, 23 Apr 2019 14:08:25 +0000 (UTC) Received: from localhost ([127.0.0.1]:54371 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hIw5w-0001U5-FK for patchwork-qemu-devel@patchwork.kernel.org; Tue, 23 Apr 2019 10:08:24 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40483) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hIw2b-00074I-Sr for qemu-devel@nongnu.org; Tue, 23 Apr 2019 10:04:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hIvsW-0006nX-5d for qemu-devel@nongnu.org; Tue, 23 Apr 2019 09:54:33 -0400 Received: from kylie.crudebyte.com ([5.189.157.229]:58379) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1hIvsV-0006ll-So for qemu-devel@nongnu.org; Tue, 23 Apr 2019 09:54:32 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=crudebyte.com; s=kylie; 
h=Content-Type:Content-Transfer-Encoding: MIME-Version:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=f8OvdNhJQF3kmYWz248I2bzfrKUrfp0ofgOEVjyXQ6o=; b=Ux2ZUmPqDfh5RFioi/WZGsHuoF YeHDw8bCLvoYsGvCjrIJMZfvmD7Go4IW7kKs943ARoXyNyh7DQpTOk0BULY3NIC5GnzsYfZjixcO8 8I/yd67pgP9zBh0OnwyROx91JGSu4n82kJVT6y+CmHffon31zt7EDPQsZk9D0MaEbFG0vu/DSwK+m LqedE/25dGp4XEqFq5xKMCj729HdZHhHw0lV76PM0kypC7FNFGg+95mEc0puE/CJUV4xF2vQhCJa3 SzhNMOUpP/nCEa6Hpbh6ciExftwDpAGKDimeTvls+KNnLKjX56uke/ShQ0cr1MYTDTdkgk+PWrII3 IhbjRuV1MNi9Jc1mFy5zD9i6DU6STcY/w8BO57JxhkBkIPvM8xtKMtHFFdcXYYyySYkWrEU8YhnVE lh81WuhttN4C+1Rd+nX6PYzSxp/b0+h2kfDSoGuJOtNUoG+R7hgaWPRUErdyIbYDUNHHiCS4bnyjQ Sta+ZJ9DAqY2VAjguLLtloomUGNuAgCmwqSBnQmKEjh6Qenv1lmIy3RIT7kfW/7QP59qPxrzDt/ZJ iV8JSooGXB8cwYYv+F5PLnc/TmYnqGZ4VwXZTU0sgUXBijGEGShI3PNt7p6EN25/qVr4NG0cTBK2c tVj4QzpEQoIwvKrCnS/pL3fYbvPSPbsbfSIlnvOeQ=; To: qemu-devel@nongnu.org Date: Tue, 23 Apr 2019 15:54:30 +0200 Message-ID: <1947340.ayO2TLl7lz@silver> MIME-Version: 1.0 X-Spam_score: -0.0 X-Spam_score_int: 0 X-Spam_bar: / X-Spam_report: Spam detection software, running on the system "kylie.crudebyte.com", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: Addresses trivial changes regarding the previous patch as requested on the mailing list a while ago. 
* Removed unneccessary parantheses: https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg02661.html Content analysis details: (-0.0 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: gnu.org] -0.0 NO_RELAYS Informational: message was not relayed via SMTP -0.0 NO_RECEIVED Informational: message has no Received headers X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 5.189.157.229 Subject: [Qemu-devel] [PATCH 2/3] 9P: trivial cleanup of QID path collision mitigation X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Christian Schoenebeck via Qemu-devel From: "Denis V. Lunev\" via" Reply-To: Christian Schoenebeck Cc: Greg Kurz , Antonios Motakis Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Addresses trivial changes regarding the previous patch as requested on the mailing list a while ago. 
* Removed unneccessary parantheses: https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg02661.html * Removed unneccessary g_malloc() result checks: https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg02814.html * Unsigned type changes: https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg02581.html Signed-off-by: Christian Schoenebeck --- fsdev/9p-marshal.h | 2 +- hw/9pfs/9p.c | 16 +++++----------- hw/9pfs/trace-events | 14 +++++++------- 3 files changed, 13 insertions(+), 19 deletions(-) diff --git a/fsdev/9p-marshal.h b/fsdev/9p-marshal.h index d1ad3645c4..8f3babb60a 100644 --- a/fsdev/9p-marshal.h +++ b/fsdev/9p-marshal.h @@ -9,7 +9,7 @@ typedef struct V9fsString typedef struct V9fsQID { - int8_t type; + uint8_t type; uint32_t version; uint64_t path; } V9fsQID; diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index b9bbdcbaee..2b893e25a1 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -587,13 +587,13 @@ static uint32_t qpf_hash(QpfEntry e) static bool qpp_cmp_func(const void *obj, const void *userp) { const QppEntry *e1 = obj, *e2 = userp; - return (e1->dev == e2->dev) && (e1->ino_prefix == e2->ino_prefix); + return e1->dev == e2->dev && e1->ino_prefix == e2->ino_prefix; } static bool qpf_cmp_func(const void *obj, const void *userp) { const QpfEntry *e1 = obj, *e2 = userp; - return (e1->dev == e2->dev) && (e1->ino == e2->ino); + return e1->dev == e2->dev && e1->ino == e2->ino; } static void qp_table_remove(void *p, uint32_t h, void *up) @@ -630,9 +630,6 @@ static int qid_path_fullmap(V9fsPDU *pdu, const struct stat *stbuf, } val = g_malloc0(sizeof(QppEntry)); - if (!val) { - return -ENOMEM; - } *val = lookup; /* new unique inode and device combo */ @@ -673,9 +670,6 @@ static int qid_path_prefixmap(V9fsPDU *pdu, const struct stat *stbuf, } val = g_malloc0(sizeof(QppEntry)); - if (!val) { - return -ENOMEM; - } *val = lookup; /* new unique inode prefix and device combo */ @@ -870,9 +864,9 @@ static int donttouch_stat(V9fsStat *stat) { if (stat->type == -1 && 
stat->dev == -1 && - stat->qid.type == -1 && - stat->qid.version == -1 && - stat->qid.path == -1 && + stat->qid.type == 0xff && + stat->qid.version == (uint32_t) -1 && + stat->qid.path == (uint64_t) -1 && stat->mode == -1 && stat->atime == -1 && stat->mtime == -1 && diff --git a/hw/9pfs/trace-events b/hw/9pfs/trace-events index c0a0a4ab5d..6964756922 100644 --- a/hw/9pfs/trace-events +++ b/hw/9pfs/trace-events @@ -6,7 +6,7 @@ v9fs_rerror(uint16_t tag, uint8_t id, int err) "tag %d id %d err %d" v9fs_version(uint16_t tag, uint8_t id, int32_t msize, char* version) "tag %d id %d msize %d version %s" v9fs_version_return(uint16_t tag, uint8_t id, int32_t msize, char* version) "tag %d id %d msize %d version %s" v9fs_attach(uint16_t tag, uint8_t id, int32_t fid, int32_t afid, char* uname, char* aname) "tag %u id %u fid %d afid %d uname %s aname %s" -v9fs_attach_return(uint16_t tag, uint8_t id, int8_t type, int32_t version, int64_t path) "tag %d id %d type %d version %d path %"PRId64 +v9fs_attach_return(uint16_t tag, uint8_t id, uint8_t type, uint32_t version, uint64_t path) "tag %d id %d type %d version %d path %"PRId64 v9fs_stat(uint16_t tag, uint8_t id, int32_t fid) "tag %d id %d fid %d" v9fs_stat_return(uint16_t tag, uint8_t id, int32_t mode, int32_t atime, int32_t mtime, int64_t length) "tag %d id %d stat={mode %d atime %d mtime %d length %"PRId64"}" v9fs_getattr(uint16_t tag, uint8_t id, int32_t fid, uint64_t request_mask) "tag %d id %d fid %d request_mask %"PRIu64 @@ -14,9 +14,9 @@ v9fs_getattr_return(uint16_t tag, uint8_t id, uint64_t result_mask, uint32_t mod v9fs_walk(uint16_t tag, uint8_t id, int32_t fid, int32_t newfid, uint16_t nwnames) "tag %d id %d fid %d newfid %d nwnames %d" v9fs_walk_return(uint16_t tag, uint8_t id, uint16_t nwnames, void* qids) "tag %d id %d nwnames %d qids %p" v9fs_open(uint16_t tag, uint8_t id, int32_t fid, int32_t mode) "tag %d id %d fid %d mode %d" -v9fs_open_return(uint16_t tag, uint8_t id, int8_t type, int32_t version, int64_t path, 
int iounit) "tag %d id %d qid={type %d version %d path %"PRId64"} iounit %d" +v9fs_open_return(uint16_t tag, uint8_t id, uint8_t type, uint32_t version, uint64_t path, int iounit) "tag %d id %d qid={type %d version %d path %"PRId64"} iounit %d" v9fs_lcreate(uint16_t tag, uint8_t id, int32_t dfid, int32_t flags, int32_t mode, uint32_t gid) "tag %d id %d dfid %d flags %d mode %d gid %u" -v9fs_lcreate_return(uint16_t tag, uint8_t id, int8_t type, int32_t version, int64_t path, int32_t iounit) "tag %d id %d qid={type %d version %d path %"PRId64"} iounit %d" +v9fs_lcreate_return(uint16_t tag, uint8_t id, uint8_t type, uint32_t version, uint64_t path, int32_t iounit) "tag %d id %d qid={type %d version %d path %"PRId64"} iounit %d" v9fs_fsync(uint16_t tag, uint8_t id, int32_t fid, int datasync) "tag %d id %d fid %d datasync %d" v9fs_clunk(uint16_t tag, uint8_t id, int32_t fid) "tag %d id %d fid %d" v9fs_read(uint16_t tag, uint8_t id, int32_t fid, uint64_t off, uint32_t max_count) "tag %d id %d fid %d off %"PRIu64" max_count %u" @@ -26,21 +26,21 @@ v9fs_readdir_return(uint16_t tag, uint8_t id, uint32_t count, ssize_t retval) "t v9fs_write(uint16_t tag, uint8_t id, int32_t fid, uint64_t off, uint32_t count, int cnt) "tag %d id %d fid %d off %"PRIu64" count %u cnt %d" v9fs_write_return(uint16_t tag, uint8_t id, int32_t total, ssize_t err) "tag %d id %d total %d err %zd" v9fs_create(uint16_t tag, uint8_t id, int32_t fid, char* name, int32_t perm, int8_t mode) "tag %d id %d fid %d name %s perm %d mode %d" -v9fs_create_return(uint16_t tag, uint8_t id, int8_t type, int32_t version, int64_t path, int iounit) "tag %d id %d qid={type %d version %d path %"PRId64"} iounit %d" +v9fs_create_return(uint16_t tag, uint8_t id, uint8_t type, uint32_t version, uint64_t path, int iounit) "tag %d id %d qid={type %d version %d path %"PRId64"} iounit %d" v9fs_symlink(uint16_t tag, uint8_t id, int32_t fid, char* name, char* symname, uint32_t gid) "tag %d id %d fid %d name %s symname %s gid %u" 
-v9fs_symlink_return(uint16_t tag, uint8_t id, int8_t type, int32_t version, int64_t path) "tag %d id %d qid={type %d version %d path %"PRId64"}" +v9fs_symlink_return(uint16_t tag, uint8_t id, uint8_t type, uint32_t version, uint64_t path) "tag %d id %d qid={type %d version %d path %"PRId64"}" v9fs_flush(uint16_t tag, uint8_t id, int16_t flush_tag) "tag %d id %d flush_tag %d" v9fs_link(uint16_t tag, uint8_t id, int32_t dfid, int32_t oldfid, char* name) "tag %d id %d dfid %d oldfid %d name %s" v9fs_remove(uint16_t tag, uint8_t id, int32_t fid) "tag %d id %d fid %d" v9fs_wstat(uint16_t tag, uint8_t id, int32_t fid, int32_t mode, int32_t atime, int32_t mtime) "tag %u id %u fid %d stat={mode %d atime %d mtime %d}" v9fs_mknod(uint16_t tag, uint8_t id, int32_t fid, int mode, int major, int minor) "tag %d id %d fid %d mode %d major %d minor %d" -v9fs_mknod_return(uint16_t tag, uint8_t id, int8_t type, int32_t version, int64_t path) "tag %d id %d qid={type %d version %d path %"PRId64"}" +v9fs_mknod_return(uint16_t tag, uint8_t id, uint8_t type, uint32_t version, uint64_t path) "tag %d id %d qid={type %d version %d path %"PRId64"}" v9fs_lock(uint16_t tag, uint8_t id, int32_t fid, uint8_t type, uint64_t start, uint64_t length) "tag %d id %d fid %d type %d start %"PRIu64" length %"PRIu64 v9fs_lock_return(uint16_t tag, uint8_t id, int8_t status) "tag %d id %d status %d" v9fs_getlock(uint16_t tag, uint8_t id, int32_t fid, uint8_t type, uint64_t start, uint64_t length)"tag %d id %d fid %d type %d start %"PRIu64" length %"PRIu64 v9fs_getlock_return(uint16_t tag, uint8_t id, uint8_t type, uint64_t start, uint64_t length, uint32_t proc_id) "tag %d id %d type %d start %"PRIu64" length %"PRIu64" proc_id %u" v9fs_mkdir(uint16_t tag, uint8_t id, int32_t fid, char* name, int mode, uint32_t gid) "tag %u id %u fid %d name %s mode %d gid %u" -v9fs_mkdir_return(uint16_t tag, uint8_t id, int8_t type, int32_t version, int64_t path, int err) "tag %u id %u qid={type %d version %d path 
%"PRId64"} err %d"
+v9fs_mkdir_return(uint16_t tag, uint8_t id, uint8_t type, uint32_t version, uint64_t path, int err) "tag %u id %u qid={type %d version %d path %"PRId64"} err %d"
 v9fs_xattrwalk(uint16_t tag, uint8_t id, int32_t fid, int32_t newfid, char* name) "tag %d id %d fid %d newfid %d name %s"
 v9fs_xattrwalk_return(uint16_t tag, uint8_t id, int64_t size) "tag %d id %d size %"PRId64
 v9fs_xattrcreate(uint16_t tag, uint8_t id, int32_t fid, char* name, uint64_t size, int flags) "tag %d id %d fid %d name %s size %"PRIu64" flags %d"

From patchwork Tue Apr 23 13:56:33 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Denis V. Lunev\" via"
X-Patchwork-Id: 10913161
To: qemu-devel@nongnu.org
Date: Tue, 23 Apr 2019 15:56:33 +0200
Message-ID: <2736976.mc3WPJycmO@silver>
Subject: [Qemu-devel] [PATCH 3/3] 9p: persistency of QID path beyond reboots / suspensions
X-Patchwork-Original-From: Christian Schoenebeck via Qemu-devel
From: "Denis V. Lunev\" via"
Reply-To: Christian Schoenebeck
Cc: Greg Kurz, Antonios Motakis

This patch aims to keep QID path identical beyond the scope of reboots and guest suspensions. With the 1st patch alone the QID path of the same files might change after reboots / suspensions, since 9p would restart with empty qpp_table and the resulting QID path depends on the precise sequence of files being accessed on guest.
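
To illustrate the dependence on access order, here is a minimal, self-contained sketch. It is not code from this series: DemoTable, demo_table_init() and demo_qid_path() are hypothetical names, a small linear array stands in for the qht based qpp_table, and the mask value assumes that QPATH_INO_MASK keeps the low 48 bits of the inode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for qpp_table: a tiny linear table, not a qht. */
#define DEMO_MAX_PREFIXES 16
/* Assumption: the real QPATH_INO_MASK keeps the low 48 bits of the inode. */
#define DEMO_INO_MASK ((UINT64_C(1) << 48) - 1)

typedef struct {
    uint64_t dev;
    uint16_t ino_prefix; /* 16 highest bits of the inode number */
    uint16_t qp_prefix;  /* handed out in order of first access */
} DemoEntry;

typedef struct {
    DemoEntry entries[DEMO_MAX_PREFIXES];
    size_t used;
    uint16_t next; /* next prefix to hand out; 0 stays reserved */
} DemoTable;

static void demo_table_init(DemoTable *t)
{
    assert(t != NULL);
    t->used = 0;
    t->next = 1;
}

/* Returns the QID path for (dev, ino), or 0 if the demo table is full. */
static uint64_t demo_qid_path(DemoTable *t, uint64_t dev, uint64_t ino)
{
    uint16_t ino_prefix = (uint16_t)(ino >> 48);
    size_t i;

    for (i = 0; i < t->used; ++i) {
        if (t->entries[i].dev == dev &&
            t->entries[i].ino_prefix == ino_prefix) {
            break;
        }
    }
    if (i == t->used) {
        if (t->used == DEMO_MAX_PREFIXES) {
            return 0;
        }
        /* new (device, inode prefix) combo: assign the next free prefix */
        t->entries[i].dev = dev;
        t->entries[i].ino_prefix = ino_prefix;
        t->entries[i].qp_prefix = t->next++;
        t->used++;
    }
    return ((uint64_t)t->entries[i].qp_prefix << 48) | (ino & DEMO_INO_MASK);
}
```

Starting from two empty tables and touching the same two devices in opposite order yields swapped QID paths for the same files, which is the situation a guest ends up in after a restart with an empty qpp_table.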

The first patch should already avoid the vast majority of potential QID path collisions. However, network services running on the guest in particular would still be prone to QID path issues when using only the 1st patch. For instance, Samba exports file IDs to clients in the network, and SMB clients in the network will use those IDs to access and request changes on the file server. If the guest is interrupted in between, as commonly happens on maintenance (e.g. software updates on the host), then SMB clients in the network will continue working with old file IDs, which in turn leads to data corruption and data loss on the file server. Furthermore, on the SMB client side I also encountered severe misbehaviours in this case: for instance, Macs accessing the file server would either start to hang or die with a kernel panic within seconds, since the smbx implementation on macOS heavily relies on file IDs being unique (within the context of a connection, that is).

This patch mitigates the remaining problem described above by storing the qpp_table persistently as extended attribute(s) on the exported root of the file system, and it automatically tries to restore the qpp_table, i.e. after reboots / resumptions.

This patch is aimed at real world scenarios, in which qpp_table will only ever get a few dozen entries (and none ever in qpf_table). So it is intentionally limited to storing only qpp_table, not qpf_table; and so far I have not made optimizations, since in practice the qpf_table is really just tiny.

Since there is currently no callback in qemu that would reliably be called on guest shutdowns, the table is stored on every new insertion for now.
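
As a rough sketch of how a serialized table can be subdivided across several numbered extended attributes, consider the following. It is an illustration under assumptions, not the code of this patch: DemoChunk and demo_plan_chunks() are hypothetical names, and the function only plans names, offsets and lengths without touching the file system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_NAME_MAX 64

/* One planned xattr: its name plus the slice of the stream it would hold. */
typedef struct {
    char name[DEMO_NAME_MAX]; /* e.g. "user.virtfs.system.qidp.0" */
    size_t offset;            /* start of the slice within the stream */
    size_t length;            /* number of bytes in the slice */
} DemoChunk;

/*
 * Splits a stream of `total` bytes into slices of at most `chunk_max` bytes.
 * Fills `out` and returns the number of slices, or 0 if there is nothing to
 * store or if `out` (with room for `max_chunks` entries) is too small.
 */
static size_t demo_plan_chunks(size_t total, size_t chunk_max,
                               DemoChunk *out, size_t max_chunks)
{
    size_t i, n;

    assert(chunk_max > 0);
    n = (total + chunk_max - 1) / chunk_max; /* round up */
    if (n > max_chunks) {
        return 0;
    }
    for (i = 0; i < n; ++i) {
        size_t offset = i * chunk_max;
        size_t left = total - offset;

        snprintf(out[i].name, sizeof(out[i].name),
                 "user.virtfs.system.qidp.%zu", i);
        out[i].offset = offset;
        out[i].length = left < chunk_max ? left : chunk_max;
    }
    return n;
}
```

A caller would then issue one lsetxattr() per planned slice. With a 65536 byte limit per value, a 150000 byte stream ends up in three attributes, the last one holding the remaining 18928 bytes.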

Signed-off-by: Christian Schoenebeck
---
 hw/9pfs/9p.c | 315 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/9pfs/9p.h |  33 +++++++
 2 files changed, 343 insertions(+), 5 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 2b893e25a1..29c6dfc68a 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -26,6 +26,19 @@
 #include "migration/blocker.h"
 #include "sysemu/qtest.h"
 #include "qemu/xxhash.h"
+#include "qemu/crc32c.h"
+#if defined(__linux__) /* TODO: This should probably go into osdep.h instead */
+# include <linux/limits.h> /* for XATTR_SIZE_MAX */
+#endif
+
+/*
+ * How many bytes may we store to fs per extended attribute value?
+ */
+#ifdef XATTR_SIZE_MAX
+# define ATTR_MAX_SIZE XATTR_SIZE_MAX /* Linux only: 64kB limit in kernel */
+#else
+# define ATTR_MAX_SIZE 65536 /* Most systems allow a bit more, so we take this as basis. */
+#endif
 
 int open_fd_hw;
 int total_open_fd;
@@ -642,6 +655,285 @@ static int qid_path_fullmap(V9fsPDU *pdu, const struct stat *stbuf,
     return 0;
 }
 
+static inline bool is_ro_export(FsContext *ctx)
+{
+    return ctx->export_flags & V9FS_RDONLY;
+}
+
+/*
+ * Once qpp_table size exceeds this value, we no longer save
+ * the table persistently. See comment in v9fs_store_qpp_table()
+ */
+#define QPP_TABLE_PERSISTENCY_LIMIT 32768
+
+/* Remove all user.virtfs.system.qidp.* xattrs from export root. */
+static void remove_qidp_xattr(FsContext *ctx)
+{
+    V9fsString name;
+    int i;
+
+    /* just for a paranoid endless loop sanity check */
+    const ssize_t max_size =
+        sizeof(QppSrlzHeader) +
+        QPP_TABLE_PERSISTENCY_LIMIT * sizeof(QppEntryS);
+
+    v9fs_string_init(&name);
+    for (i = 0; i * ATTR_MAX_SIZE < max_size; ++i) {
+        v9fs_string_sprintf(&name, "user.virtfs.system.qidp.%d", i);
+        if (lremovexattr(ctx->fs_root, name.data) < 0)
+            break;
+    }
+    v9fs_string_free(&name);
+}
+
+/* Used to convert qpp hash table into continuous stream. */
+static void qpp_table_serialize(void *p, uint32_t h, void *up)
+{
+    const QppEntry *entry = (const QppEntry*) p;
+    QppSerialize *ser = (QppSerialize*) up;
+
+    if (ser->error)
+        return;
+
+    /* safety first */
+    if (entry->qp_prefix - 1 >= ser->count) {
+        ser->error = -1;
+        return;
+    }
+
+    ser->elements[entry->qp_prefix - 1] = (QppEntryS) {
+        .dev = entry->dev,
+        .ino_prefix = entry->ino_prefix
+    };
+    ser->done++;
+}
+
+/*
+ * Tries to store the current qpp_table as extended attribute(s) on the
+ * exported file system root with the goal to preserve identical qids
+ * beyond the scope of reboots.
+ */
+static void v9fs_store_qpp_table(V9fsState *s)
+{
+    FsContext *ctx = &s->ctx;
+    V9fsString name;
+    int i, res;
+    size_t size;
+    QppSrlzStream* stream;
+    QppSerialize ser;
+
+    if (is_ro_export(ctx))
+        return;
+
+    /*
+     * Whenever we exceeded some certain (arbitrary) high qpp_table size we
+     * delete the stored table from the file system to get rid of old device
+     * ids / inodes that might no longer exist with the goal to potentially
+     * yield a smaller table size after next reboot.
+     */
+    if (!s->qp_prefix_next || s->qp_prefix_next >= QPP_TABLE_PERSISTENCY_LIMIT) {
+        if (s->qp_prefix_next == QPP_TABLE_PERSISTENCY_LIMIT) {
+            remove_qidp_xattr(ctx);
+        }
+        return;
+    }
+
+    /* Convert qpp hash table into continuous array. */
+    size = sizeof(QppSrlzHeader) +
+        ( (s->qp_prefix_next - 1) /* qpp_table entry count */ * sizeof(QppEntryS) );
+    stream = g_malloc0(size);
+    ser = (QppSerialize) {
+        .elements = &stream->elements[0],
+        .count = s->qp_prefix_next - 1,
+        .done = 0,
+        .error = 0,
+    };
+    qht_iter(&s->qpp_table, qpp_table_serialize, &ser);
+    if (ser.error || ser.done != ser.count)
+        goto out;
+
+    /* initialize header and calculate CRC32 checksum */
+    stream->header = (QppSrlzHeader) {
+        .version = 1,
+        .reserved = 0,
+        .crc32 = crc32c(
+            0xffffffff,
+            (const uint8_t*) &stream->elements[0],
+            (ser.count * sizeof(QppEntryS))
+        ),
+    };
+
+    /*
+     * Actually just required if the qpp_table size decreased, or if the
+     * previous xattr size limit increased on OS (kernel/fs) level.
+     */
+    remove_qidp_xattr(ctx);
+
+    /*
+     * Subdivide (if required) the data stream into individual xattrs
+     * to cope with the system's max. supported xattr value size.
+     */
+    v9fs_string_init(&name);
+    for (i = 0; size > (i * ATTR_MAX_SIZE); ++i) {
+        v9fs_string_sprintf(&name, "user.virtfs.system.qidp.%d", i);
+        res = lsetxattr(
+            ctx->fs_root,
+            name.data,
+            ((const uint8_t*)stream) + i * ATTR_MAX_SIZE,
+            MIN(ATTR_MAX_SIZE, size - i * ATTR_MAX_SIZE),
+            0/*flags*/
+        );
+        if (res < 0) {
+            if (i > 0)
+                remove_qidp_xattr(ctx);
+            break;
+        }
+    }
+    v9fs_string_free(&name);
+out:
+    g_free(stream);
+}
+
+/* Frees the entire chain of passed nodes from memory. */
+static void destroy_xattr_nodes(XAttrNode **first)
+{
+    XAttrNode *prev;
+    if (!first)
+        return;
+    while (*first) {
+        if ((*first)->value)
+            g_free((*first)->value);
+        prev = *first;
+        *first = (*first)->next;
+        g_free(prev);
+    }
+}
+
+/*
+ * Loads all user.virtfs.system.qidp.* xattrs from exported fs root and
+ * returns a linked list with one node per xattr.
+ */
+static XAttrNode* v9fs_load_qidp_xattr_nodes(V9fsState *s)
+{
+    FsContext *ctx = &s->ctx;
+    XAttrNode *first = NULL, *current = NULL;
+    V9fsString name;
+    ssize_t size;
+    int i;
+
+    const ssize_t max_size =
+        sizeof(QppSrlzHeader) +
+        QPP_TABLE_PERSISTENCY_LIMIT * sizeof(QppEntryS);
+
+    v9fs_string_init(&name);
+
+    for (i = 0; i * ATTR_MAX_SIZE < max_size; ++i) {
+        v9fs_string_sprintf(&name, "user.virtfs.system.qidp.%d", i);
+        size = lgetxattr(ctx->fs_root, name.data, NULL, 0);
+        if (size <= 0)
+            break;
+        if (!first) {
+            first = current = g_malloc0(sizeof(XAttrNode));
+        } else {
+            current = current->next = g_malloc0(sizeof(XAttrNode));
+        }
+        current->value = g_malloc0(size);
+        current->length = lgetxattr(
+            ctx->fs_root, name.data, current->value, size
+        );
+        if (current->length <= 0) {
+            goto out_w_err;
+        }
+    }
+    goto out;
+
+out_w_err:
+    destroy_xattr_nodes(&first);
+out:
+    v9fs_string_free(&name);
+    return first;
+}
+
+/*
+ * Try to load previously stored qpp_table from file system. Calling this
+ * function assumes that qpp_table is yet empty.
+ *
+ * @see v9fs_store_qpp_table()
+ */
+static void v9fs_load_qpp_table(V9fsState *s)
+{
+    ssize_t size, count;
+    XAttrNode *current, *first;
+    QppSrlzStream* stream = NULL;
+    uint32_t crc32;
+    int i;
+    QppEntry *val;
+    uint32_t hash;
+
+    if (s->qp_prefix_next != 1)
+        return;
+
+    first = v9fs_load_qidp_xattr_nodes(s);
+    if (!first)
+        return;
+
+    /* convert nodes into continuous stream */
+    size = 0;
+    for (current = first; current; current = current->next) {
+        size += current->length;
+    }
+    if (size <= 0) {
+        goto out;
+    }
+    stream = g_malloc0(size);
+    size = 0;
+    for (current = first; current; current = current->next) {
+        memcpy(((uint8_t*)stream) + size, current->value, current->length);
+        size += current->length;
+    }
+
+    if (stream->header.version != 1) {
+        goto out;
+    }
+
+    count = (size - sizeof(QppSrlzHeader)) / sizeof(QppEntryS);
+    if (count <= 0) {
+        goto out;
+    }
+
+    /* verify CRC32 checksum of stream */
+    crc32 = crc32c(
+        0xffffffff,
+        (const uint8_t*) &stream->elements[0],
+        (count * sizeof(QppEntryS))
+    );
+    if (crc32 != stream->header.crc32) {
+        goto out;
+    }
+
+    /* fill qpp_table with the retrieved elements */
+    for (i = 0; i < count; ++i) {
+        val = g_malloc0(sizeof(QppEntry));
+        *val = (QppEntry) {
+            .dev = stream->elements[i].dev,
+            .ino_prefix = stream->elements[i].ino_prefix,
+        };
+        hash = qpp_hash(*val);
+        if (qht_lookup(&s->qpp_table, val, hash)) {
+            /* should never happen: duplicate entry detected */
+            g_free(val);
+            goto out;
+        }
+        val->qp_prefix = s->qp_prefix_next++;
+        qht_insert(&s->qpp_table, val, hash, NULL);
+    }
+
+out:
+    destroy_xattr_nodes(&first);
+    if (stream)
+        g_free(stream);
+}
+
 /* stat_to_qid needs to map inode number (64 bits) and device id (32 bits)
  * to a unique QID path (64 bits). To avoid having to map and keep track
  * of up to 2^64 objects, we map only the 16 highest bits of the inode plus
@@ -675,6 +967,14 @@ static int qid_path_prefixmap(V9fsPDU *pdu, const struct stat *stbuf,
         /* new unique inode prefix and device combo */
         val->qp_prefix = pdu->s->qp_prefix_next++;
         qht_insert(&pdu->s->qpp_table, val, hash, NULL);
+
+        /*
+         * Store qpp_table as extended attribute(s) to file system.
+         *
+         * TODO: This should better only be called from a guest shutdown and
+         * suspend handler.
+         */
+        v9fs_store_qpp_table(pdu->s);
     }
 
     *path = ((uint64_t)val->qp_prefix << 48) | (stbuf->st_ino & QPATH_INO_MASK);
@@ -1064,11 +1364,6 @@ static void v9fs_fix_path(V9fsPath *dst, V9fsPath *src, int len)
     v9fs_path_free(&str);
 }
 
-static inline bool is_ro_export(FsContext *ctx)
-{
-    return ctx->export_flags & V9FS_RDONLY;
-}
-
 static void coroutine_fn v9fs_version(void *opaque)
 {
     ssize_t err;
@@ -3784,6 +4079,8 @@ int v9fs_device_realize_common(V9fsState *s, const V9fsTransport *t,
     qht_init(&s->qpp_table, qpp_cmp_func, 1, QHT_MODE_AUTO_RESIZE);
     s->qp_prefix_next = 1; /* reserve 0 to detect overflow */
     s->qp_fullpath_next = 1;
+    /* try to load and restore previous qpp_table */
+    v9fs_load_qpp_table(s);
 
     s->ctx.fst = &fse->fst;
     fsdev_throttle_init(s->ctx.fst);
@@ -3807,6 +4104,14 @@ out:
 
 void v9fs_device_unrealize_common(V9fsState *s, Error **errp)
 {
+    /*
+     * Store qpp_table as extended attribute(s) to file system.
+     *
+     * This was actually plan A, but unfortunately unrealize is not called
+     * reliably on guest shutdowns and suspensions.
+     */
+    v9fs_store_qpp_table(s);
+
     if (s->ops->cleanup) {
         s->ops->cleanup(&s->ctx);
     }
diff --git a/hw/9pfs/9p.h b/hw/9pfs/9p.h
index 44112ea97f..54ce039969 100644
--- a/hw/9pfs/9p.h
+++ b/hw/9pfs/9p.h
@@ -245,6 +245,13 @@ typedef struct {
     uint16_t qp_prefix;
 } QppEntry;
 
+/* Small version of QppEntry for serialization as xattr. */
+struct QppEntryS {
+    dev_t dev;
+    uint16_t ino_prefix;
+} __attribute__((packed));
+typedef struct QppEntryS QppEntryS;
+
 /* QID path full entry, as above */
 typedef struct {
     dev_t dev;
@@ -252,6 +259,32 @@ typedef struct {
     uint64_t path;
 } QpfEntry;
 
+typedef struct {
+    QppEntryS *elements;
+    uint count; /* In: QppEntryS count in @a elements */
+    uint done; /* Out: how many QppEntryS did we actually fill in @a elements */
+    int error; /* Out: zero on success */
+} QppSerialize;
+
+struct QppSrlzHeader {
+    uint16_t version;
+    uint16_t reserved; /* might be used e.g. for flags in future */
+    uint32_t crc32;
+} __attribute__((packed));
+typedef struct QppSrlzHeader QppSrlzHeader;
+
+struct QppSrlzStream {
+    QppSrlzHeader header;
+    QppEntryS elements[0];
+} __attribute__((packed));
+typedef struct QppSrlzStream QppSrlzStream;
+
+typedef struct XAttrNode {
+    uint8_t* value;
+    ssize_t length;
+    struct XAttrNode* next;
+} XAttrNode;
+
 struct V9fsState
 {
     QLIST_HEAD(, V9fsPDU) free_list;