Message ID | 20221219102022.2167736-8-bin.meng@windriver.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | hw/9pfs: Add 9pfs support for Windows | expand |
On Monday, December 19, 2022 11:20:11 AM CET Bin Meng wrote: > From: Guohuai Shi <guohuai.shi@windriver.com> > > On Windows 'struct dirent' does not have current directory offset. > Update qemu_dirent_off() to support Windows. > > While we are here, add a build time check to error out if a new > host does not implement this helper. > > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com> > Signed-off-by: Bin Meng <bin.meng@windriver.com> > --- > > (no changes since v1) > > hw/9pfs/9p-util.h | 11 ++++++++--- > hw/9pfs/9p-util-win32.c | 7 +++++++ > hw/9pfs/9p.c | 4 ++-- > hw/9pfs/codir.c | 2 +- > 4 files changed, 18 insertions(+), 6 deletions(-) > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h > index 90420a7578..e395936b30 100644 > --- a/hw/9pfs/9p-util.h > +++ b/hw/9pfs/9p-util.h > @@ -127,6 +127,7 @@ int unlinkat_win32(int dirfd, const char *pathname, int flags); > int statfs_win32(const char *root_path, struct statfs *stbuf); > int openat_dir(int dirfd, const char *name); > int openat_file(int dirfd, const char *name, int flags, mode_t mode); > +off_t qemu_dirent_off_win32(void *s, void *fs); > #endif > > static inline void close_preserve_errno(int fd) > @@ -200,12 +201,16 @@ ssize_t fremovexattrat_nofollow(int dirfd, const char *filename, > * so ensure it is manually injected earlier and call here when > * needed. > */ > -static inline off_t qemu_dirent_off(struct dirent *dent) > +static inline off_t qemu_dirent_off(struct dirent *dent, void *s, void *fs) > { Not sure why you chose void* here. > -#ifdef CONFIG_DARWIN > +#if defined(CONFIG_DARWIN) > return dent->d_seekoff; > -#else > +#elif defined(CONFIG_LINUX) > return dent->d_off; > +#elif defined(CONFIG_WIN32) > + return qemu_dirent_off_win32(s, fs); > +#else > +#error Missing qemu_dirent_off() implementation for this host system > #endif > } > > diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c > index 7a270a7bd5..3592e057ce 100644 > --- a/hw/9pfs/9p-util-win32.c > +++ b/hw/9pfs/9p-util-win32.c > @@ -929,3 +929,10 @@ int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev) > errno = ENOTSUP; > return -1; > } > + > +off_t qemu_dirent_off_win32(void *s, void *fs) > +{ > + V9fsState *v9fs = s; > + > + return v9fs->ops->telldir(&v9fs->ctx, (V9fsFidOpenState *)fs); > +} That's a bit tricky. So far (i.e. for Linux host, macOS host) we are storing the directory offset directly to the dirent struct. We are not using telldir() as the stream might have mutated in the meantime, hence calling telldir() might return a value that does no longer correlate to dirent in question. Hence my idea was to use the same hack for Windows as we did for macOS, where we simply misuse a dirent field known to be not used, on macOS that's d_seekoff which we are misusing for that purpose. Looking at the mingw dirent.h header though, there is not much we can choose from. d_reclen is not used, but too short, d_ino is not used by mingw, but currently we actually read this field in server common code to assemble a file ID for guest. I mean it is always zero on Windows, so we could still misuse it, but we would need to inject more hacks there. :/ > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c > index 072cf67956..be247eeb30 100644 > --- a/hw/9pfs/9p.c > +++ b/hw/9pfs/9p.c > @@ -2336,7 +2336,7 @@ static int coroutine_fn v9fs_do_readdir_with_stat(V9fsPDU *pdu, > count += len; > v9fs_stat_free(&v9stat); > v9fs_path_free(&path); > - saved_dir_pos = qemu_dirent_off(dent); > + saved_dir_pos = qemu_dirent_off(dent, pdu->s, &fidp->fs); > } > > v9fs_readdir_unlock(&fidp->fs.dir); > @@ -2537,7 +2537,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp, > qid.version = 0; > } > > - off = qemu_dirent_off(dent); > + off = qemu_dirent_off(dent, pdu->s, &fidp->fs); > v9fs_string_init(&name); > v9fs_string_sprintf(&name, "%s", dent->d_name); > > diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c > index 93ba44fb75..d40515a607 100644 > --- a/hw/9pfs/codir.c > +++ b/hw/9pfs/codir.c > @@ -168,7 +168,7 @@ static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp, > } > > size += len; > - saved_dir_pos = qemu_dirent_off(dent); > + saved_dir_pos = qemu_dirent_off(dent, s, &fidp->fs); > } > > /* restore (last) saved position */ >
> -----Original Message----- > From: Christian Schoenebeck <qemu_oss@crudebyte.com> > Sent: Wednesday, December 21, 2022 22:48 > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org > Cc: Shi, Guohuai <Guohuai.Shi@windriver.com>; Meng, Bin > <Bin.Meng@windriver.com> > Subject: Re: [PATCH v3 07/17] hw/9pfs: Support getting current directory > offset for Windows > > CAUTION: This email comes from a non Wind River email account! > Do not click links or open attachments unless you recognize the sender and > know the content is safe. > > On Monday, December 19, 2022 11:20:11 AM CET Bin Meng wrote: > > From: Guohuai Shi <guohuai.shi@windriver.com> > > > > On Windows 'struct dirent' does not have current directory offset. > > Update qemu_dirent_off() to support Windows. > > > > While we are here, add a build time check to error out if a new host > > does not implement this helper. > > > > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com> > > Signed-off-by: Bin Meng <bin.meng@windriver.com> > > --- > > > > (no changes since v1) > > > > hw/9pfs/9p-util.h | 11 ++++++++--- > > hw/9pfs/9p-util-win32.c | 7 +++++++ > > hw/9pfs/9p.c | 4 ++-- > > hw/9pfs/codir.c | 2 +- > > 4 files changed, 18 insertions(+), 6 deletions(-) > > > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index > > 90420a7578..e395936b30 100644 > > --- a/hw/9pfs/9p-util.h > > +++ b/hw/9pfs/9p-util.h > > @@ -127,6 +127,7 @@ int unlinkat_win32(int dirfd, const char > > *pathname, int flags); int statfs_win32(const char *root_path, struct > > statfs *stbuf); int openat_dir(int dirfd, const char *name); int > > openat_file(int dirfd, const char *name, int flags, mode_t mode); > > +off_t qemu_dirent_off_win32(void *s, void *fs); > > #endif > > > > static inline void close_preserve_errno(int fd) @@ -200,12 +201,16 @@ > > ssize_t fremovexattrat_nofollow(int dirfd, const char *filename, > > * so ensure it is manually injected earlier and call here when > > * needed. > > */ > > -static inline off_t qemu_dirent_off(struct dirent *dent) > > +static inline off_t qemu_dirent_off(struct dirent *dent, void *s, > > +void *fs) > > { > > Not sure why you chose void* here. It is "V9fsState *" here, but 9p-util.h may be included by some other source file which is not have definition of "V9fsState". So change it to "void *" to prevent unnecessary type declarations. > > > -#ifdef CONFIG_DARWIN > > +#if defined(CONFIG_DARWIN) > > return dent->d_seekoff; > > -#else > > +#elif defined(CONFIG_LINUX) > > return dent->d_off; > > +#elif defined(CONFIG_WIN32) > > + return qemu_dirent_off_win32(s, fs); #else #error Missing > > +qemu_dirent_off() implementation for this host system > > #endif > > } > > > > diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index > > 7a270a7bd5..3592e057ce 100644 > > --- a/hw/9pfs/9p-util-win32.c > > +++ b/hw/9pfs/9p-util-win32.c > > @@ -929,3 +929,10 @@ int qemu_mknodat(int dirfd, const char *filename, > mode_t mode, dev_t dev) > > errno = ENOTSUP; > > return -1; > > } > > + > > +off_t qemu_dirent_off_win32(void *s, void *fs) { > > + V9fsState *v9fs = s; > > + > > + return v9fs->ops->telldir(&v9fs->ctx, (V9fsFidOpenState *)fs); } > > That's a bit tricky. So far (i.e. for Linux host, macOS host) we are storing > the directory offset directly to the dirent struct. We are not using > telldir() as the stream might have mutated in the meantime, hence calling > telldir() might return a value that does no longer correlate to dirent in > question. > > Hence my idea was to use the same hack for Windows as we did for macOS, where > we simply misuse a dirent field known to be not used, on macOS that's > d_seekoff which we are misusing for that purpose. > > Looking at the mingw dirent.h header though, there is not much we can choose > from. d_reclen is not used, but too short, d_ino is not used by mingw, but > currently we actually read this field in server common code to assemble a > file ID for guest. I mean it is always zero on Windows, so we could still > misuse it, but we would need to inject more hacks there. :/ If you check seekdir and telldir() implement in MinGW, you can see that MinGW (and Windows) does not have a safety way to seek/tell directory offset. Because Windows POSIX and native API does not provide a way to seek directory. Windows native API only allow to read directory forward, but not backward. So even we store the d_seekoff to some places, the directory may still be modified. And Windows file system actually has inode number, MinGW does not introduce it to MinGW API. So I think it is not good way to use d_ino. Thanks Guohuai > > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 072cf67956..be247eeb30 > > 100644 > > --- a/hw/9pfs/9p.c > > +++ b/hw/9pfs/9p.c > > @@ -2336,7 +2336,7 @@ static int coroutine_fn > v9fs_do_readdir_with_stat(V9fsPDU *pdu, > > count += len; > > v9fs_stat_free(&v9stat); > > v9fs_path_free(&path); > > - saved_dir_pos = qemu_dirent_off(dent); > > + saved_dir_pos = qemu_dirent_off(dent, pdu->s, &fidp->fs); > > } > > > > v9fs_readdir_unlock(&fidp->fs.dir); > > @@ -2537,7 +2537,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, > V9fsFidState *fidp, > > qid.version = 0; > > } > > > > - off = qemu_dirent_off(dent); > > + off = qemu_dirent_off(dent, pdu->s, &fidp->fs); > > v9fs_string_init(&name); > > v9fs_string_sprintf(&name, "%s", dent->d_name); > > > > diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c index > > 93ba44fb75..d40515a607 100644 > > --- a/hw/9pfs/codir.c > > +++ b/hw/9pfs/codir.c > > @@ -168,7 +168,7 @@ static int do_readdir_many(V9fsPDU *pdu, V9fsFidState > *fidp, > > } > > > > size += len; > > - saved_dir_pos = qemu_dirent_off(dent); > > + saved_dir_pos = qemu_dirent_off(dent, s, &fidp->fs); > > } > > > > /* restore (last) saved position */ > > > >
On Wednesday, December 21, 2022 7:02:43 PM CET Shi, Guohuai wrote: > > > -----Original Message----- > > From: Christian Schoenebeck <qemu_oss@crudebyte.com> > > Sent: Wednesday, December 21, 2022 22:48 > > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org > > Cc: Shi, Guohuai <Guohuai.Shi@windriver.com>; Meng, Bin > > <Bin.Meng@windriver.com> > > Subject: Re: [PATCH v3 07/17] hw/9pfs: Support getting current directory > > offset for Windows > > > > CAUTION: This email comes from a non Wind River email account! > > Do not click links or open attachments unless you recognize the sender and > > know the content is safe. > > > > On Monday, December 19, 2022 11:20:11 AM CET Bin Meng wrote: > > > From: Guohuai Shi <guohuai.shi@windriver.com> > > > > > > On Windows 'struct dirent' does not have current directory offset. > > > Update qemu_dirent_off() to support Windows. > > > > > > While we are here, add a build time check to error out if a new host > > > does not implement this helper. > > > > > > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com> > > > Signed-off-by: Bin Meng <bin.meng@windriver.com> > > > --- > > > > > > (no changes since v1) > > > > > > hw/9pfs/9p-util.h | 11 ++++++++--- > > > hw/9pfs/9p-util-win32.c | 7 +++++++ > > > hw/9pfs/9p.c | 4 ++-- > > > hw/9pfs/codir.c | 2 +- > > > 4 files changed, 18 insertions(+), 6 deletions(-) > > > > > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index > > > 90420a7578..e395936b30 100644 > > > --- a/hw/9pfs/9p-util.h > > > +++ b/hw/9pfs/9p-util.h > > > @@ -127,6 +127,7 @@ int unlinkat_win32(int dirfd, const char > > > *pathname, int flags); int statfs_win32(const char *root_path, struct > > > statfs *stbuf); int openat_dir(int dirfd, const char *name); int > > > openat_file(int dirfd, const char *name, int flags, mode_t mode); > > > +off_t qemu_dirent_off_win32(void *s, void *fs); > > > #endif > > > > > > static inline void close_preserve_errno(int fd) @@ -200,12 +201,16 @@ > > > ssize_t fremovexattrat_nofollow(int dirfd, const char *filename, > > > * so ensure it is manually injected earlier and call here when > > > * needed. > > > */ > > > -static inline off_t qemu_dirent_off(struct dirent *dent) > > > +static inline off_t qemu_dirent_off(struct dirent *dent, void *s, > > > +void *fs) > > > { > > > > Not sure why you chose void* here. > > It is "V9fsState *" here, but 9p-util.h may be included by some other source file which is not have definition of "V9fsState". > So change it to "void *" to prevent unnecessary type declarations. You can anonymously declare the struct to avoid cyclic dependencies (e.g. like in 9p.h): typedef struct V9fsState V9fsState; Avoid the void. > > > > > -#ifdef CONFIG_DARWIN > > > +#if defined(CONFIG_DARWIN) > > > return dent->d_seekoff; > > > -#else > > > +#elif defined(CONFIG_LINUX) > > > return dent->d_off; > > > +#elif defined(CONFIG_WIN32) > > > + return qemu_dirent_off_win32(s, fs); #else #error Missing > > > +qemu_dirent_off() implementation for this host system > > > #endif > > > } > > > > > > diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index > > > 7a270a7bd5..3592e057ce 100644 > > > --- a/hw/9pfs/9p-util-win32.c > > > +++ b/hw/9pfs/9p-util-win32.c > > > @@ -929,3 +929,10 @@ int qemu_mknodat(int dirfd, const char *filename, > > mode_t mode, dev_t dev) > > > errno = ENOTSUP; > > > return -1; > > > } > > > + > > > +off_t qemu_dirent_off_win32(void *s, void *fs) { > > > + V9fsState *v9fs = s; > > > + > > > + return v9fs->ops->telldir(&v9fs->ctx, (V9fsFidOpenState *)fs); } > > > > That's a bit tricky. So far (i.e. for Linux host, macOS host) we are storing > > the directory offset directly to the dirent struct. We are not using > > telldir() as the stream might have mutated in the meantime, hence calling > > telldir() might return a value that does no longer correlate to dirent in > > question. > > > > Hence my idea was to use the same hack for Windows as we did for macOS, where > > we simply misuse a dirent field known to be not used, on macOS that's > > d_seekoff which we are misusing for that purpose. > > > > Looking at the mingw dirent.h header though, there is not much we can choose > > from. d_reclen is not used, but too short, d_ino is not used by mingw, but > > currently we actually read this field in server common code to assemble a > > file ID for guest. I mean it is always zero on Windows, so we could still > > misuse it, but we would need to inject more hacks there. :/ > > If you check seekdir and telldir() implement in MinGW, > you can see that MinGW (and Windows) does not have a safety way to seek/tell directory offset. > Because Windows POSIX and native API does not provide a way to seek directory. > Windows native API only allow to read directory forward, but not backward. > So even we store the d_seekoff to some places, the directory may still be modified. > > And Windows file system actually has inode number, MinGW does not introduce it to MinGW API. > So I think it is not good way to use d_ino. Scratch that d_ino hack. My original concern was that we might get concurrency on the directory stream, however after a recap I stumbled over one of my comments on this: static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp, struct V9fsDirEnt **entries, off_t offset, int32_t maxsize, bool dostat) { ... /* * TODO: Here should be a warn_report_once() if lock failed. * * With a good 9p client we should not get into concurrency here, * because a good client would not use the same fid for concurrent * requests. We do the lock here for safety reasons though. However * the client would then suffer performance issues, so better log that * issue here. */ v9fs_readdir_lock(&fidp->fs.dir); ... } So it boils down to whether or not we would want to care about broken 9p clients on Windows host or not, and considering the huge amount of code to deal with here for Windows host, we might say that we have more important (real) problems to deal with than caring about a broken 9p client that does not clone a FID before sending Treaddir (9p2000.L) or Tread (9p2000.u). However looking at current MinGW implementation I start to think whether we should use that API for directory retrieval at all, because for seekdir() they are rewinding the directory stream to the very beginning on *each* call of seekdir() and then readdir() in a while loop to the requested location for which they use a simple, self-incremented consecutive number as stream position: https://github.com/Alexpux/mingw-w64/blob/master/mingw-w64-crt/misc/dirent.c#L319 Which can lead to two problems: 1. n-square performance issue (minor issue). 2. Inconsistent state if directory is modified in between (major issue), e.g. the same directory appearing more than once in output, or directories not appearing at all in output even though they were neither newly created nor deleted. POSIX does not define what happens with deleted or newly created directories in between, but it guarantees a consistent state. What about calling _findfirst() and _findnext() directly, fetching all directory entries at once, closing the stream, and 9p would just access that snapshot instead? As long as the stream is not closed, Windows blocks all directory changes, so we would have a consistent state. > > > > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 072cf67956..be247eeb30 > > > 100644 > > > --- a/hw/9pfs/9p.c > > > +++ b/hw/9pfs/9p.c > > > @@ -2336,7 +2336,7 @@ static int coroutine_fn > > v9fs_do_readdir_with_stat(V9fsPDU *pdu, > > > count += len; > > > v9fs_stat_free(&v9stat); > > > v9fs_path_free(&path); > > > - saved_dir_pos = qemu_dirent_off(dent); > > > + saved_dir_pos = qemu_dirent_off(dent, pdu->s, &fidp->fs); > > > } > > > > > > v9fs_readdir_unlock(&fidp->fs.dir); > > > @@ -2537,7 +2537,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, > > V9fsFidState *fidp, > > > qid.version = 0; > > > } > > > > > > - off = qemu_dirent_off(dent); > > > + off = qemu_dirent_off(dent, pdu->s, &fidp->fs); > > > v9fs_string_init(&name); > > > v9fs_string_sprintf(&name, "%s", dent->d_name); > > > > > > diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c index > > > 93ba44fb75..d40515a607 100644 > > > --- a/hw/9pfs/codir.c > > > +++ b/hw/9pfs/codir.c > > > @@ -168,7 +168,7 @@ static int do_readdir_many(V9fsPDU *pdu, V9fsFidState > > *fidp, > > > } > > > > > > size += len; > > > - saved_dir_pos = qemu_dirent_off(dent); > > > + saved_dir_pos = qemu_dirent_off(dent, s, &fidp->fs); > > > } > > > > > > /* restore (last) saved position */ > > > > > > > > > >
> -----Original Message----- > From: Christian Schoenebeck <qemu_oss@crudebyte.com> > Sent: Wednesday, December 28, 2022 19:51 > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org > Cc: Meng, Bin <Bin.Meng@windriver.com>; Shi, Guohuai > <Guohuai.Shi@windriver.com> > Subject: Re: [PATCH v3 07/17] hw/9pfs: Support getting current directory > offset for Windows > > CAUTION: This email comes from a non Wind River email account! > Do not click links or open attachments unless you recognize the sender and > know the content is safe. > > On Wednesday, December 21, 2022 7:02:43 PM CET Shi, Guohuai wrote: > > > > > -----Original Message----- > > > From: Christian Schoenebeck <qemu_oss@crudebyte.com> > > > Sent: Wednesday, December 21, 2022 22:48 > > > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org > > > Cc: Shi, Guohuai <Guohuai.Shi@windriver.com>; Meng, Bin > > > <Bin.Meng@windriver.com> > > > Subject: Re: [PATCH v3 07/17] hw/9pfs: Support getting current > > > directory offset for Windows > > > > > > CAUTION: This email comes from a non Wind River email account! > > > Do not click links or open attachments unless you recognize the > > > sender and know the content is safe. > > > > > > On Monday, December 19, 2022 11:20:11 AM CET Bin Meng wrote: > > > > From: Guohuai Shi <guohuai.shi@windriver.com> > > > > > > > > On Windows 'struct dirent' does not have current directory offset. > > > > Update qemu_dirent_off() to support Windows. > > > > > > > > While we are here, add a build time check to error out if a new > > > > host does not implement this helper. > > > > > > > > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com> > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com> > > > > --- > > > > > > > > (no changes since v1) > > > > > > > > hw/9pfs/9p-util.h | 11 ++++++++--- > > > > hw/9pfs/9p-util-win32.c | 7 +++++++ > > > > hw/9pfs/9p.c | 4 ++-- > > > > hw/9pfs/codir.c | 2 +- > > > > 4 files changed, 18 insertions(+), 6 deletions(-) > > > > > > > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index > > > > 90420a7578..e395936b30 100644 > > > > --- a/hw/9pfs/9p-util.h > > > > +++ b/hw/9pfs/9p-util.h > > > > @@ -127,6 +127,7 @@ int unlinkat_win32(int dirfd, const char > > > > *pathname, int flags); int statfs_win32(const char *root_path, > > > > struct statfs *stbuf); int openat_dir(int dirfd, const char > > > > *name); int openat_file(int dirfd, const char *name, int flags, > > > > mode_t mode); > > > > +off_t qemu_dirent_off_win32(void *s, void *fs); > > > > #endif > > > > > > > > static inline void close_preserve_errno(int fd) @@ -200,12 > > > > +201,16 @@ ssize_t fremovexattrat_nofollow(int dirfd, const char > *filename, > > > > * so ensure it is manually injected earlier and call here when > > > > * needed. > > > > */ > > > > -static inline off_t qemu_dirent_off(struct dirent *dent) > > > > +static inline off_t qemu_dirent_off(struct dirent *dent, void *s, > > > > +void *fs) > > > > { > > > > > > Not sure why you chose void* here. > > > > It is "V9fsState *" here, but 9p-util.h may be included by some other > source file which is not have definition of "V9fsState". > > So change it to "void *" to prevent unnecessary type declarations. > > You can anonymously declare the struct to avoid cyclic dependencies (e.g. > like in 9p.h): > > typedef struct V9fsState V9fsState; > > Avoid the void. > Got it, understood. > > > > > > > -#ifdef CONFIG_DARWIN > > > > +#if defined(CONFIG_DARWIN) > > > > return dent->d_seekoff; > > > > -#else > > > > +#elif defined(CONFIG_LINUX) > > > > return dent->d_off; > > > > +#elif defined(CONFIG_WIN32) > > > > + return qemu_dirent_off_win32(s, fs); #else #error Missing > > > > +qemu_dirent_off() implementation for this host system > > > > #endif > > > > } > > > > > > > > diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c > > > > index 7a270a7bd5..3592e057ce 100644 > > > > --- a/hw/9pfs/9p-util-win32.c > > > > +++ b/hw/9pfs/9p-util-win32.c > > > > @@ -929,3 +929,10 @@ int qemu_mknodat(int dirfd, const char > > > > *filename, > > > mode_t mode, dev_t dev) > > > > errno = ENOTSUP; > > > > return -1; > > > > } > > > > + > > > > +off_t qemu_dirent_off_win32(void *s, void *fs) { > > > > + V9fsState *v9fs = s; > > > > + > > > > + return v9fs->ops->telldir(&v9fs->ctx, (V9fsFidOpenState > > > > + *)fs); } > > > > > > That's a bit tricky. So far (i.e. for Linux host, macOS host) we are > > > storing the directory offset directly to the dirent struct. We are > > > not using > > > telldir() as the stream might have mutated in the meantime, hence > > > calling > > > telldir() might return a value that does no longer correlate to > > > dirent in question. > > > > > > Hence my idea was to use the same hack for Windows as we did for > > > macOS, where we simply misuse a dirent field known to be not used, > > > on macOS that's d_seekoff which we are misusing for that purpose. > > > > > > Looking at the mingw dirent.h header though, there is not much we > > > can choose from. d_reclen is not used, but too short, d_ino is not > > > used by mingw, but currently we actually read this field in server > > > common code to assemble a file ID for guest. I mean it is always > > > zero on Windows, so we could still misuse it, but we would need to > > > inject more hacks there. :/ > > > > If you check seekdir and telldir() implement in MinGW, you can see > > that MinGW (and Windows) does not have a safety way to seek/tell directory > offset. > > Because Windows POSIX and native API does not provide a way to seek > directory. > > Windows native API only allow to read directory forward, but not backward. > > So even we store the d_seekoff to some places, the directory may still be > modified. > > > > And Windows file system actually has inode number, MinGW does not introduce > it to MinGW API. > > So I think it is not good way to use d_ino. > > Scratch that d_ino hack. My original concern was that we might get > concurrency on the directory stream, however after a recap I stumbled over > one of my comments on this: > > static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp, > struct V9fsDirEnt **entries, off_t offset, > int32_t maxsize, bool dostat) { > ... > /* > * TODO: Here should be a warn_report_once() if lock failed. > * > * With a good 9p client we should not get into concurrency here, > * because a good client would not use the same fid for concurrent > * requests. We do the lock here for safety reasons though. However > * the client would then suffer performance issues, so better log that > * issue here. > */ > v9fs_readdir_lock(&fidp->fs.dir); > ... > } > > So it boils down to whether or not we would want to care about broken 9p > clients on Windows host or not, and considering the huge amount of code to > deal with here for Windows host, we might say that we have more important > (real) problems to deal with than caring about a broken 9p client that does > not clone a FID before sending Treaddir (9p2000.L) or Tread (9p2000.u). > > However looking at current MinGW implementation I start to think whether we > should use that API for directory retrieval at all, because for seekdir() > they are rewinding the directory stream to the very beginning on *each* call > of > seekdir() and then readdir() in a while loop to the requested location for > which they use a simple, self-incremented consecutive number as stream > position: > > https://github.com/Alexpux/mingw-w64/blob/master/mingw-w64- > crt/misc/dirent.c#L319 > > Which can lead to two problems: > > 1. n-square performance issue (minor issue). > > 2. Inconsistent state if directory is modified in between (major issue), e.g. > the same directory appearing more than once in output, or directories not > appearing at all in output even though they were neither newly created nor > deleted. POSIX does not define what happens with deleted or newly created > directories in between, but it guarantees a consistent state. > > What about calling _findfirst() and _findnext() directly, fetching all > directory entries at once, closing the stream, and 9p would just access that > snapshot instead? As long as the stream is not closed, Windows blocks all > directory changes, so we would have a consistent state. > I met the issue #2 you mentioned, you can see the comments I added in function local_seekdir(). Fetching all directory entries can resolve this issue, but it may need to allocate large memory if there are many entries. And we also need to re-write local_telldir(), local_rewinddir(), local_opendir(), local_closedir(). It would be a big change for 9pfs. > > > > > > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index > > > > 072cf67956..be247eeb30 > > > > 100644 > > > > --- a/hw/9pfs/9p.c > > > > +++ b/hw/9pfs/9p.c > > > > @@ -2336,7 +2336,7 @@ static int coroutine_fn > > > v9fs_do_readdir_with_stat(V9fsPDU *pdu, > > > > count += len; > > > > v9fs_stat_free(&v9stat); > > > > v9fs_path_free(&path); > > > > - saved_dir_pos = qemu_dirent_off(dent); > > > > + saved_dir_pos = qemu_dirent_off(dent, pdu->s, &fidp->fs); > > > > } > > > > > > > > v9fs_readdir_unlock(&fidp->fs.dir); > > > > @@ -2537,7 +2537,7 @@ static int coroutine_fn > > > > v9fs_do_readdir(V9fsPDU *pdu, > > > V9fsFidState *fidp, > > > > qid.version = 0; > > > > } > > > > > > > > - off = qemu_dirent_off(dent); > > > > + off = qemu_dirent_off(dent, pdu->s, &fidp->fs); > > > > v9fs_string_init(&name); > > > > v9fs_string_sprintf(&name, "%s", dent->d_name); > > > > > > > > diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c index > > > > 93ba44fb75..d40515a607 100644 > > > > --- a/hw/9pfs/codir.c > > > > +++ b/hw/9pfs/codir.c > > > > @@ -168,7 +168,7 @@ static int do_readdir_many(V9fsPDU *pdu, > > > > V9fsFidState > > > *fidp, > > > > } > > > > > > > > size += len; > > > > - saved_dir_pos = qemu_dirent_off(dent); > > > > + saved_dir_pos = qemu_dirent_off(dent, s, &fidp->fs); > > > > } > > > > > > > > /* restore (last) saved position */ > > > > > > > > > > > > > > > > >
On Thursday, December 29, 2022 7:03:54 AM CET Shi, Guohuai wrote: > > > -----Original Message----- > > From: Christian Schoenebeck <qemu_oss@crudebyte.com> > > Sent: Wednesday, December 28, 2022 19:51 > > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org > > Cc: Meng, Bin <Bin.Meng@windriver.com>; Shi, Guohuai > > <Guohuai.Shi@windriver.com> > > Subject: Re: [PATCH v3 07/17] hw/9pfs: Support getting current directory > > offset for Windows > > > > CAUTION: This email comes from a non Wind River email account! > > Do not click links or open attachments unless you recognize the sender and > > know the content is safe. > > > > On Wednesday, December 21, 2022 7:02:43 PM CET Shi, Guohuai wrote: > > > > > > > -----Original Message----- > > > > From: Christian Schoenebeck <qemu_oss@crudebyte.com> > > > > Sent: Wednesday, December 21, 2022 22:48 > > > > To: Greg Kurz <groug@kaod.org>; qemu-devel@nongnu.org > > > > Cc: Shi, Guohuai <Guohuai.Shi@windriver.com>; Meng, Bin > > > > <Bin.Meng@windriver.com> > > > > Subject: Re: [PATCH v3 07/17] hw/9pfs: Support getting current > > > > directory offset for Windows > > > > > > > > CAUTION: This email comes from a non Wind River email account! > > > > Do not click links or open attachments unless you recognize the > > > > sender and know the content is safe. > > > > > > > > On Monday, December 19, 2022 11:20:11 AM CET Bin Meng wrote: > > > > > From: Guohuai Shi <guohuai.shi@windriver.com> > > > > > > > > > > On Windows 'struct dirent' does not have current directory offset. > > > > > Update qemu_dirent_off() to support Windows. > > > > > > > > > > While we are here, add a build time check to error out if a new > > > > > host does not implement this helper. > > > > > > > > > > Signed-off-by: Guohuai Shi <guohuai.shi@windriver.com> > > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com> > > > > > --- > > > > > > > > > > (no changes since v1) > > > > > > > > > > hw/9pfs/9p-util.h | 11 ++++++++--- > > > > > hw/9pfs/9p-util-win32.c | 7 +++++++ > > > > > hw/9pfs/9p.c | 4 ++-- > > > > > hw/9pfs/codir.c | 2 +- > > > > > 4 files changed, 18 insertions(+), 6 deletions(-) > > > > > > > > > > diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index > > > > > 90420a7578..e395936b30 100644 > > > > > --- a/hw/9pfs/9p-util.h > > > > > +++ b/hw/9pfs/9p-util.h > > > > > @@ -127,6 +127,7 @@ int unlinkat_win32(int dirfd, const char > > > > > *pathname, int flags); int statfs_win32(const char *root_path, > > > > > struct statfs *stbuf); int openat_dir(int dirfd, const char > > > > > *name); int openat_file(int dirfd, const char *name, int flags, > > > > > mode_t mode); > > > > > +off_t qemu_dirent_off_win32(void *s, void *fs); > > > > > #endif > > > > > > > > > > static inline void close_preserve_errno(int fd) @@ -200,12 > > > > > +201,16 @@ ssize_t fremovexattrat_nofollow(int dirfd, const char > > *filename, > > > > > * so ensure it is manually injected earlier and call here when > > > > > * needed. > > > > > */ > > > > > -static inline off_t qemu_dirent_off(struct dirent *dent) > > > > > +static inline off_t qemu_dirent_off(struct dirent *dent, void *s, > > > > > +void *fs) > > > > > { > > > > > > > > Not sure why you chose void* here. > > > > > > It is "V9fsState *" here, but 9p-util.h may be included by some other > > source file which is not have definition of "V9fsState". > > > So change it to "void *" to prevent unnecessary type declarations. > > > > You can anonymously declare the struct to avoid cyclic dependencies (e.g. > > like in 9p.h): > > > > typedef struct V9fsState V9fsState; > > > > Avoid the void. > > > > Got it, understood. > > > > > > > > > > -#ifdef CONFIG_DARWIN > > > > > +#if defined(CONFIG_DARWIN) > > > > > return dent->d_seekoff; > > > > > -#else > > > > > +#elif defined(CONFIG_LINUX) > > > > > return dent->d_off; > > > > > +#elif defined(CONFIG_WIN32) > > > > > + return qemu_dirent_off_win32(s, fs); #else #error Missing > > > > > +qemu_dirent_off() implementation for this host system > > > > > #endif > > > > > } > > > > > > > > > > diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c > > > > > index 7a270a7bd5..3592e057ce 100644 > > > > > --- a/hw/9pfs/9p-util-win32.c > > > > > +++ b/hw/9pfs/9p-util-win32.c > > > > > @@ -929,3 +929,10 @@ int qemu_mknodat(int dirfd, const char > > > > > *filename, > > > > mode_t mode, dev_t dev) > > > > > errno = ENOTSUP; > > > > > return -1; > > > > > } > > > > > + > > > > > +off_t qemu_dirent_off_win32(void *s, void *fs) { > > > > > + V9fsState *v9fs = s; > > > > > + > > > > > + return v9fs->ops->telldir(&v9fs->ctx, (V9fsFidOpenState > > > > > + *)fs); } > > > > > > > > That's a bit tricky. So far (i.e. for Linux host, macOS host) we are > > > > storing the directory offset directly to the dirent struct. We are > > > > not using > > > > telldir() as the stream might have mutated in the meantime, hence > > > > calling > > > > telldir() might return a value that does no longer correlate to > > > > dirent in question. > > > > > > > > Hence my idea was to use the same hack for Windows as we did for > > > > macOS, where we simply misuse a dirent field known to be not used, > > > > on macOS that's d_seekoff which we are misusing for that purpose. > > > > > > > > Looking at the mingw dirent.h header though, there is not much we > > > > can choose from. d_reclen is not used, but too short, d_ino is not > > > > used by mingw, but currently we actually read this field in server > > > > common code to assemble a file ID for guest. I mean it is always > > > > zero on Windows, so we could still misuse it, but we would need to > > > > inject more hacks there. :/ > > > > > > If you check seekdir and telldir() implement in MinGW, you can see > > > that MinGW (and Windows) does not have a safety way to seek/tell directory > > offset. > > > Because Windows POSIX and native API does not provide a way to seek > > directory. > > > Windows native API only allow to read directory forward, but not backward. > > > So even we store the d_seekoff to some places, the directory may still be > > modified. > > > > > > And Windows file system actually has inode number, MinGW does not introduce > > it to MinGW API. > > > So I think it is not good way to use d_ino. > > > > Scratch that d_ino hack. My original concern was that we might get > > concurrency on the directory stream, however after a recap I stumbled over > > one of my comments on this: > > > > static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp, > > struct V9fsDirEnt **entries, off_t offset, > > int32_t maxsize, bool dostat) { > > ... > > /* > > * TODO: Here should be a warn_report_once() if lock failed. > > * > > * With a good 9p client we should not get into concurrency here, > > * because a good client would not use the same fid for concurrent > > * requests. We do the lock here for safety reasons though. However > > * the client would then suffer performance issues, so better log that > > * issue here. > > */ > > v9fs_readdir_lock(&fidp->fs.dir); > > ... > > } > > > > So it boils down to whether or not we would want to care about broken 9p > > clients on Windows host or not, and considering the huge amount of code to > > deal with here for Windows host, we might say that we have more important > > (real) problems to deal with than caring about a broken 9p client that does > > not clone a FID before sending Treaddir (9p2000.L) or Tread (9p2000.u). > > > > However looking at current MinGW implementation I start to think whether we > > should use that API for directory retrieval at all, because for seekdir() > > they are rewinding the directory stream to the very beginning on *each* call > > of > > seekdir() and then readdir() in a while loop to the requested location for > > which they use a simple, self-incremented consecutive number as stream > > position: > > > > https://github.com/Alexpux/mingw-w64/blob/master/mingw-w64-> > crt/misc/dirent.c#L319 > > > > Which can lead to two problems: > > > > 1. n-square performance issue (minor issue). > > > > 2. Inconsistent state if directory is modified in between (major issue), e.g. > > the same directory appearing more than once in output, or directories not > > appearing at all in output even though they were neither newly created nor > > deleted. POSIX does not define what happens with deleted or newly created > > directories in between, but it guarantees a consistent state. > > > > What about calling _findfirst() and _findnext() directly, fetching all > > directory entries at once, closing the stream, and 9p would just access that > > snapshot instead? As long as the stream is not closed, Windows blocks all > > directory changes, so we would have a consistent state. > > > > I met the issue #2 you mentioned, you can see the comments I added in function local_seekdir(). > Fetching all directory entries can resolve this issue, but it may need to allocate large memory if there are many entries. True, considering that NTFS allows up to (2^32 - 1) = 4,294,967,295 entries per directory, that's a bit much to cache in RAM. > And we also need to re-write local_telldir(), local_rewinddir(), local_opendir(), local_closedir(). > It would be a big change for 9pfs. I fear this is an issue that cannot be neglected nor postponed. All non modified directory entries must be returned to client, and no duplicates may occur, otherwise you would get severe misbehaviours on Windows hosts. I just checked how this is solved in other apps, and apparently it is common to use b-tree cookies (i.e. search result cache on disk). I would consider that as a long-term solution though, as it would be a lot of work on its own. I also noticed that Microsoft does actually have a continuation token API for this purpose, but unfortunately it seems to be limited to Azure services. What about this for now: 1. Adding qemu_readdir(), qemu_seekdir(), qemu_telldir(); for Windows adding custom implementation (as described below), all other platforms just define them directly to readdir(), seekdir(), telldir(). 2. In the Windows implementation of those functions: calling the _findfirst() and _findnext() functions directly (not the MinGW ones) and returning each directory entry sorted by inode number: + Relatively straight-forward to implement. + No (major) changes in 9pfs code base required. - Still n-square performance issue (neglectable to land Windows host support IMO). o Consistency assured for "most" cases, except one: if hardlinks are inserted in between then it might fail. > > > > > > > > > > diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index > > > > > 072cf67956..be247eeb30 > > > > > 100644 > > > > > --- a/hw/9pfs/9p.c > > > > > +++ b/hw/9pfs/9p.c > > > > > @@ -2336,7 +2336,7 @@ static int coroutine_fn > > > > v9fs_do_readdir_with_stat(V9fsPDU *pdu, > > > > > count += len; > > > > > v9fs_stat_free(&v9stat); > > > > > v9fs_path_free(&path); > > > > > - saved_dir_pos = qemu_dirent_off(dent); > > > > > + saved_dir_pos = qemu_dirent_off(dent, pdu->s, &fidp->fs); > > > > > } > > > > > > > > > > v9fs_readdir_unlock(&fidp->fs.dir); > > > > > @@ -2537,7 +2537,7 @@ static int coroutine_fn > > > > > v9fs_do_readdir(V9fsPDU *pdu, > > > > V9fsFidState *fidp, > > > > > qid.version = 0; > > > > > } > > > > > > > > > > - off = qemu_dirent_off(dent); > > > > > + off = qemu_dirent_off(dent, pdu->s, &fidp->fs); > > > > > v9fs_string_init(&name); > > > > > v9fs_string_sprintf(&name, "%s", dent->d_name); > > > > > > > > > > diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c index > > > > > 93ba44fb75..d40515a607 100644 > > > > > --- a/hw/9pfs/codir.c > > > > > +++ b/hw/9pfs/codir.c > > > > > @@ -168,7 +168,7 @@ static int do_readdir_many(V9fsPDU *pdu, > > > > > V9fsFidState > > > > *fidp, > > > > > } > > > > > > > > > > size += len; > > > > > - saved_dir_pos = qemu_dirent_off(dent); > > > > > + saved_dir_pos = qemu_dirent_off(dent, s, &fidp->fs); > > > > > } > > > > > > > > > > /* restore (last) saved position */ > > > > > > > > > > > > > > > > > > > > > > > > > > >
diff --git a/hw/9pfs/9p-util.h b/hw/9pfs/9p-util.h index 90420a7578..e395936b30 100644 --- a/hw/9pfs/9p-util.h +++ b/hw/9pfs/9p-util.h @@ -127,6 +127,7 @@ int unlinkat_win32(int dirfd, const char *pathname, int flags); int statfs_win32(const char *root_path, struct statfs *stbuf); int openat_dir(int dirfd, const char *name); int openat_file(int dirfd, const char *name, int flags, mode_t mode); +off_t qemu_dirent_off_win32(void *s, void *fs); #endif static inline void close_preserve_errno(int fd) @@ -200,12 +201,16 @@ ssize_t fremovexattrat_nofollow(int dirfd, const char *filename, * so ensure it is manually injected earlier and call here when * needed. */ -static inline off_t qemu_dirent_off(struct dirent *dent) +static inline off_t qemu_dirent_off(struct dirent *dent, void *s, void *fs) { -#ifdef CONFIG_DARWIN +#if defined(CONFIG_DARWIN) return dent->d_seekoff; -#else +#elif defined(CONFIG_LINUX) return dent->d_off; +#elif defined(CONFIG_WIN32) + return qemu_dirent_off_win32(s, fs); +#else +#error Missing qemu_dirent_off() implementation for this host system #endif } diff --git a/hw/9pfs/9p-util-win32.c b/hw/9pfs/9p-util-win32.c index 7a270a7bd5..3592e057ce 100644 --- a/hw/9pfs/9p-util-win32.c +++ b/hw/9pfs/9p-util-win32.c @@ -929,3 +929,10 @@ int qemu_mknodat(int dirfd, const char *filename, mode_t mode, dev_t dev) errno = ENOTSUP; return -1; } + +off_t qemu_dirent_off_win32(void *s, void *fs) +{ + V9fsState *v9fs = s; + + return v9fs->ops->telldir(&v9fs->ctx, (V9fsFidOpenState *)fs); +} diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c index 072cf67956..be247eeb30 100644 --- a/hw/9pfs/9p.c +++ b/hw/9pfs/9p.c @@ -2336,7 +2336,7 @@ static int coroutine_fn v9fs_do_readdir_with_stat(V9fsPDU *pdu, count += len; v9fs_stat_free(&v9stat); v9fs_path_free(&path); - saved_dir_pos = qemu_dirent_off(dent); + saved_dir_pos = qemu_dirent_off(dent, pdu->s, &fidp->fs); } v9fs_readdir_unlock(&fidp->fs.dir); @@ -2537,7 +2537,7 @@ static int coroutine_fn v9fs_do_readdir(V9fsPDU *pdu, V9fsFidState *fidp, qid.version = 0; } - off = qemu_dirent_off(dent); + off = qemu_dirent_off(dent, pdu->s, &fidp->fs); v9fs_string_init(&name); v9fs_string_sprintf(&name, "%s", dent->d_name); diff --git a/hw/9pfs/codir.c b/hw/9pfs/codir.c index 93ba44fb75..d40515a607 100644 --- a/hw/9pfs/codir.c +++ b/hw/9pfs/codir.c @@ -168,7 +168,7 @@ static int do_readdir_many(V9fsPDU *pdu, V9fsFidState *fidp, } size += len; - saved_dir_pos = qemu_dirent_off(dent); + saved_dir_pos = qemu_dirent_off(dent, s, &fidp->fs); } /* restore (last) saved position */