Message ID | cover.1702148009.git.kreijack@inwind.it (mailing list archive) |
---|---|
Headers | show |
Series | Remove unused dirstream variable | expand |
On Sat, Dec 09, 2023 at 07:53:20PM +0100, Goffredo Baroncelli wrote: > From: Goffredo Baroncelli <kreijack@inwind.it> > > For historical reason, the helpers [btrfs_]open_[file_or_]dir() work with > directory returning the 'fd' and a 'dirstream' variable returned by > opendir(3). > > If the path is a file, the 'fd' is computed from open(2) and > dirstream is set to NULL. > If the path is a directory, first the directory is opened by opendir(3), then > the 'fd' is computed using dirfd(3). > However the 'dirstream' returned by opendir(3) is left open until 'fd' > is not needed anymore. > > In near every case the 'dirstream' variable is not used. Only 'fd' is > used. As I'm reading dirfd manual page, dirfd returns the internal file descriptor of the dirstream and it gets closed after call to closedir(). This means if we pass a directory and want a file descriptor then its lifetime matches the correspoinding DIR. > A call to close_file_or_dir() freed both 'fd' and 'dirstream'. > > Aim of this patch set is to getrid of this complexity; when the path of > a directory is passed, the fd is get directly using open(path, O_RDONLY): > so we don't need to use readdir(3) and to maintain the not used variable > 'dirstream'. Does this work the same way as with the dirstream?
On 14/12/2023 17.17, David Sterba wrote: > On Sat, Dec 09, 2023 at 07:53:20PM +0100, Goffredo Baroncelli wrote: >> From: Goffredo Baroncelli <kreijack@inwind.it> >> [...] > > As I'm reading dirfd manual page, dirfd returns the internal file > descriptor of the dirstream and it gets closed after call to closedir(). > This means if we pass a directory and want a file descriptor then its > lifetime matches the correspoinding DIR. Yes, more below > >> A call to close_file_or_dir() freed both 'fd' and 'dirstream'. >> >> Aim of this patch set is to getrid of this complexity; when the path of >> a directory is passed, the fd is get directly using open(path, O_RDONLY): >> so we don't need to use readdir(3) and to maintain the not used variable >> 'dirstream'. > > Does this work the same way as with the dirstream? Yes; I tested my patches and these worked properly. My understanding is that opendir(3)/readdir(3) are based to the pair open(2)/getdents64(2). opendir(3) does a "open(path, O_RDONLY)", and store the file descriptors in a malloc-ed DIR structure. Then closedir(3) closes the fd and free the DIR structure.. You may see further details about the glib implementation. https://github.com/kraj/glibc/blob/8757659b3cee57c1defee7772c3ee687d310b076/sysdeps/unix/sysv/linux/opendir.c#L81 The BIONIC implementation does the same thing. The key point is that you may open(2) a directory only in O_RDONLY mode. In fact you cannot write to a directory. You may only read(); to "write" you have to use a different API like rmdir/mkdir/mknod or open("<dir>/<filename>", O_RDWR"). Another test that I did was: ghigo@venice:/tmp$ cat test.c #include <sys/types.h> #include <dirent.h> int main(int argc, char **argv) { DIR *dir = opendir(argv[1]); return (int)dir; } ghigo@venice:/tmp$ gcc -Wno-pointer-to-int-cast -o test test.c ghigo@venice:/tmp$ mkdir test2 ghigo@venice:/tmp$ strace ./test test2 [...] prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0 munmap(0x7f328b589000, 222607) = 0 openat(AT_FDCWD, "test2", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3 <---------- newfstatat(3, "", {st_mode=S_IFDIR|0755, st_size=0, ...}, AT_EMPTY_PATH) = 0 getrandom("\x04\x76\xb1\xe0\xee\xd8\x8a\xc8", 8, GRND_NONBLOCK) = 8 brk(NULL) = 0x55676fffd000 brk(0x55677001e000) = 0x55677001e000 exit_group(1879036576) = ? +++ exited with 160 +++ ghigo@venice:/tmp$ BR G.Baroncelli
On 14/12/2023 17.17, David Sterba wrote: > On Sat, Dec 09, 2023 at 07:53:20PM +0100, Goffredo Baroncelli wrote: >> From: Goffredo Baroncelli <kreijack@inwind.it> >> >> For historical reason, the helpers [btrfs_]open_[file_or_]dir() work with >> directory returning the 'fd' and a 'dirstream' variable returned by >> opendir(3). >> >> If the path is a file, the 'fd' is computed from open(2) and >> dirstream is set to NULL. >> If the path is a directory, first the directory is opened by opendir(3), then >> the 'fd' is computed using dirfd(3). >> However the 'dirstream' returned by opendir(3) is left open until 'fd' >> is not needed anymore. >> >> In near every case the 'dirstream' variable is not used. Only 'fd' is >> used. > > As I'm reading dirfd manual page, dirfd returns the internal file > descriptor of the dirstream and it gets closed after call to closedir(). > This means if we pass a directory and want a file descriptor then its > lifetime matches the correspoinding DIR. > >> A call to close_file_or_dir() freed both 'fd' and 'dirstream'. >> >> Aim of this patch set is to getrid of this complexity; when the path of >> a directory is passed, the fd is get directly using open(path, O_RDONLY): >> so we don't need to use readdir(3) and to maintain the not used variable >> 'dirstream'. > > Does this work the same way as with the dirstream? > Hi David, are you interested in this patch ? I think that it is a great simplification. BR
On Sat, Dec 30, 2023 at 12:20:54PM +0100, Goffredo Baroncelli wrote: > On 14/12/2023 17.17, David Sterba wrote: > > On Sat, Dec 09, 2023 at 07:53:20PM +0100, Goffredo Baroncelli wrote: > >> From: Goffredo Baroncelli <kreijack@inwind.it> > >> > >> For historical reason, the helpers [btrfs_]open_[file_or_]dir() work with > >> directory returning the 'fd' and a 'dirstream' variable returned by > >> opendir(3). > >> > >> If the path is a file, the 'fd' is computed from open(2) and > >> dirstream is set to NULL. > >> If the path is a directory, first the directory is opened by opendir(3), then > >> the 'fd' is computed using dirfd(3). > >> However the 'dirstream' returned by opendir(3) is left open until 'fd' > >> is not needed anymore. > >> > >> In near every case the 'dirstream' variable is not used. Only 'fd' is > >> used. > > > > As I'm reading dirfd manual page, dirfd returns the internal file > > descriptor of the dirstream and it gets closed after call to closedir(). > > This means if we pass a directory and want a file descriptor then its > > lifetime matches the correspoinding DIR. > > > >> A call to close_file_or_dir() freed both 'fd' and 'dirstream'. > >> > >> Aim of this patch set is to getrid of this complexity; when the path of > >> a directory is passed, the fd is get directly using open(path, O_RDONLY): > >> so we don't need to use readdir(3) and to maintain the not used variable > >> 'dirstream'. > > > > Does this work the same way as with the dirstream? > > > > Hi David, are you interested in this patch ? I think that it is a > great simplification. Yes, it's a good cleanup, I'll get to that after 6.7 is released.
On Sat, Dec 30, 2023 at 12:20:54PM +0100, Goffredo Baroncelli wrote: > On 14/12/2023 17.17, David Sterba wrote: > > On Sat, Dec 09, 2023 at 07:53:20PM +0100, Goffredo Baroncelli wrote: > >> From: Goffredo Baroncelli <kreijack@inwind.it> > >> > >> For historical reason, the helpers [btrfs_]open_[file_or_]dir() work with > >> directory returning the 'fd' and a 'dirstream' variable returned by > >> opendir(3). > >> > >> If the path is a file, the 'fd' is computed from open(2) and > >> dirstream is set to NULL. > >> If the path is a directory, first the directory is opened by opendir(3), then > >> the 'fd' is computed using dirfd(3). > >> However the 'dirstream' returned by opendir(3) is left open until 'fd' > >> is not needed anymore. > >> > >> In near every case the 'dirstream' variable is not used. Only 'fd' is > >> used. > > > > As I'm reading dirfd manual page, dirfd returns the internal file > > descriptor of the dirstream and it gets closed after call to closedir(). > > This means if we pass a directory and want a file descriptor then its > > lifetime matches the correspoinding DIR. > > > >> A call to close_file_or_dir() freed both 'fd' and 'dirstream'. > >> > >> Aim of this patch set is to getrid of this complexity; when the path of > >> a directory is passed, the fd is get directly using open(path, O_RDONLY): > >> so we don't need to use readdir(3) and to maintain the not used variable > >> 'dirstream'. > > > > Does this work the same way as with the dirstream? > > > > Hi David, are you interested in this patch ? I think that it is a > great simplification. Sorry, I got distracted by other things. The patchset does not apply cleanly on 6.7 so I've used 6.6.3 as the base for testing. There's one failure in the cli tests, result can be seen here https://github.com/kdave/btrfs-progs/actions/runs/7813042695/job/21311365019 ====== RUN MUSTFAIL /home/runner/work/btrfs-progs/btrfs-progs/btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img ERROR: not a btrfs filesystem: /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img failed (expected): /home/runner/work/btrfs-progs/btrfs-progs/btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img no expected error message in the output 2 test failed for case 003-fi-resize-args I think it's related to the changes to dirstream, however I can't reproduce it locally, only on the github actions CI. Unlike other commands in the test, this one is called on an image and it must fail to prevent accidentaly resizing underlying filesystem.
On 07/02/2024 11.16, David Sterba wrote: > On Sat, Dec 30, 2023 at 12:20:54PM +0100, Goffredo Baroncelli wrote: >> On 14/12/2023 17.17, David Sterba wrote: >>> On Sat, Dec 09, 2023 at 07:53:20PM +0100, Goffredo Baroncelli wrote: >>>> From: Goffredo Baroncelli <kreijack@inwind.it> >>>> >>>> For historical reason, the helpers [btrfs_]open_[file_or_]dir() work with >>>> directory returning the 'fd' and a 'dirstream' variable returned by >>>> opendir(3). >>>> >>>> If the path is a file, the 'fd' is computed from open(2) and >>>> dirstream is set to NULL. >>>> If the path is a directory, first the directory is opened by opendir(3), then >>>> the 'fd' is computed using dirfd(3). >>>> However the 'dirstream' returned by opendir(3) is left open until 'fd' >>>> is not needed anymore. >>>> >>>> In near every case the 'dirstream' variable is not used. Only 'fd' is >>>> used. >>> >>> As I'm reading dirfd manual page, dirfd returns the internal file >>> descriptor of the dirstream and it gets closed after call to closedir(). >>> This means if we pass a directory and want a file descriptor then its >>> lifetime matches the correspoinding DIR. >>> >>>> A call to close_file_or_dir() freed both 'fd' and 'dirstream'. >>>> >>>> Aim of this patch set is to getrid of this complexity; when the path of >>>> a directory is passed, the fd is get directly using open(path, O_RDONLY): >>>> so we don't need to use readdir(3) and to maintain the not used variable >>>> 'dirstream'. >>> >>> Does this work the same way as with the dirstream? >>> >> >> Hi David, are you interested in this patch ? I think that it is a >> great simplification. > > Sorry, I got distracted by other things. > > The patchset does not apply cleanly on 6.7 so I've used 6.6.3 as the > base for testing. There's one failure in the cli tests, result can be > seen here https://github.com/kdave/btrfs-progs/actions/runs/7813042695/job/21311365019 > > ====== RUN MUSTFAIL /home/runner/work/btrfs-progs/btrfs-progs/btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img > ERROR: not a btrfs filesystem: /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img > failed (expected): /home/runner/work/btrfs-progs/btrfs-progs/btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img > no expected error message in the output 2 > test failed for case 003-fi-resize-args > > I think it's related to the changes to dirstream, however I can't > reproduce it locally, only on the github actions CI. > > Unlike other commands in the test, this one is called on an image and it > must fail to prevent accidentaly resizing underlying filesystem. I will rebase on 6.7 and I will try to reproduce the problem. Could you share the full script, or point me where is it ? BR
On 08/02/2024 18.28, Goffredo Baroncelli wrote: > On 07/02/2024 11.16, David Sterba wrote: >> On Sat, Dec 30, 2023 at 12:20:54PM +0100, Goffredo Baroncelli wrote: >>> On 14/12/2023 17.17, David Sterba wrote: >>>> On Sat, Dec 09, 2023 at 07:53:20PM +0100, Goffredo Baroncelli wrote: >>>>> From: Goffredo Baroncelli <kreijack@inwind.it> >>>>> >>>>> For historical reason, the helpers [btrfs_]open_[file_or_]dir() work with >>>>> directory returning the 'fd' and a 'dirstream' variable returned by >>>>> opendir(3). >>>>> >>>>> If the path is a file, the 'fd' is computed from open(2) and >>>>> dirstream is set to NULL. >>>>> If the path is a directory, first the directory is opened by opendir(3), then >>>>> the 'fd' is computed using dirfd(3). >>>>> However the 'dirstream' returned by opendir(3) is left open until 'fd' >>>>> is not needed anymore. >>>>> >>>>> In near every case the 'dirstream' variable is not used. Only 'fd' is >>>>> used. >>>> >>>> As I'm reading dirfd manual page, dirfd returns the internal file >>>> descriptor of the dirstream and it gets closed after call to closedir(). >>>> This means if we pass a directory and want a file descriptor then its >>>> lifetime matches the correspoinding DIR. >>>> >>>>> A call to close_file_or_dir() freed both 'fd' and 'dirstream'. >>>>> >>>>> Aim of this patch set is to getrid of this complexity; when the path of >>>>> a directory is passed, the fd is get directly using open(path, O_RDONLY): >>>>> so we don't need to use readdir(3) and to maintain the not used variable >>>>> 'dirstream'. >>>> >>>> Does this work the same way as with the dirstream? >>>> >>> >>> Hi David, are you interested in this patch ? I think that it is a >>> great simplification. >> >> Sorry, I got distracted by other things. >> >> The patchset does not apply cleanly on 6.7 so I've used 6.6.3 as the >> base for testing. There's one failure in the cli tests, result can be >> seen here https://github.com/kdave/btrfs-progs/actions/runs/7813042695/job/21311365019 >> >> ====== RUN MUSTFAIL /home/runner/work/btrfs-progs/btrfs-progs/btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img >> ERROR: not a btrfs filesystem: /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img >> failed (expected): /home/runner/work/btrfs-progs/btrfs-progs/btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img >> no expected error message in the output 2 >> test failed for case 003-fi-resize-args >> >> I think it's related to the changes to dirstream, however I can't >> reproduce it locally, only on the github actions CI. >> >> Unlike other commands in the test, this one is called on an image and it >> must fail to prevent accidentaly resizing underlying filesystem. > > > I will rebase on 6.7 and I will try to reproduce the problem. Could you share the full script, or point me where is it ? Ignore my latter request. I found the test code. > > BR
On 08/02/2024 18.34, Goffredo Baroncelli wrote: > On 08/02/2024 18.28, Goffredo Baroncelli wrote: >> On 07/02/2024 11.16, David Sterba wrote: >>> On Sat, Dec 30, 2023 at 12:20:54PM +0100, Goffredo Baroncelli wrote: >>>> On 14/12/2023 17.17, David Sterba wrote: >>>>> On Sat, Dec 09, 2023 at 07:53:20PM +0100, Goffredo Baroncelli wrote: >>>>>> From: Goffredo Baroncelli <kreijack@inwind.it> >>>>>> >>>>>> For historical reason, the helpers [btrfs_]open_[file_or_]dir() work with >>>>>> directory returning the 'fd' and a 'dirstream' variable returned by >>>>>> opendir(3). >>>>>> >>>>>> If the path is a file, the 'fd' is computed from open(2) and >>>>>> dirstream is set to NULL. >>>>>> If the path is a directory, first the directory is opened by opendir(3), then >>>>>> the 'fd' is computed using dirfd(3). >>>>>> However the 'dirstream' returned by opendir(3) is left open until 'fd' >>>>>> is not needed anymore. >>>>>> >>>>>> In near every case the 'dirstream' variable is not used. Only 'fd' is >>>>>> used. >>>>> >>>>> As I'm reading dirfd manual page, dirfd returns the internal file >>>>> descriptor of the dirstream and it gets closed after call to closedir(). >>>>> This means if we pass a directory and want a file descriptor then its >>>>> lifetime matches the correspoinding DIR. >>>>> >>>>>> A call to close_file_or_dir() freed both 'fd' and 'dirstream'. >>>>>> >>>>>> Aim of this patch set is to getrid of this complexity; when the path of >>>>>> a directory is passed, the fd is get directly using open(path, O_RDONLY): >>>>>> so we don't need to use readdir(3) and to maintain the not used variable >>>>>> 'dirstream'. >>>>> >>>>> Does this work the same way as with the dirstream? >>>>> >>>> >>>> Hi David, are you interested in this patch ? I think that it is a >>>> great simplification. >>> >>> Sorry, I got distracted by other things. >>> >>> The patchset does not apply cleanly on 6.7 so I've used 6.6.3 as the >>> base for testing. There's one failure in the cli tests, result can be >>> seen here https://github.com/kdave/btrfs-progs/actions/runs/7813042695/job/21311365019 >>> >>> ====== RUN MUSTFAIL /home/runner/work/btrfs-progs/btrfs-progs/btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img >>> ERROR: not a btrfs filesystem: /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img >>> failed (expected): /home/runner/work/btrfs-progs/btrfs-progs/btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img >>> no expected error message in the output 2 >>> test failed for case 003-fi-resize-args >>> >>> I think it's related to the changes to dirstream, however I can't >>> reproduce it locally, only on the github actions CI. I found the problem, and I have an idea why you cannot reproduce the problem The test btrfs filesystem resize 1:-128M /home/runner/work/btrfs-progs/btrfs-progs/tests/test.img Expect the following error message: ERROR: resize works on mounted filesystems and accepts only However if you pass a file image on a NON btrfs fs, after my patch you got the following error: ERROR: not a btrfs filesystem: /home/ghigo/btrfs/btrfs-progs/tests/test.img But if you do the test on a BTRFS filesystem, you get the following error ERROR: not a directory: /mnt/btrfs-raid1/test.img ERROR: resize works on mounted filesystems and accepts only directories as argument. Passing file containing a btrfs image would resize the underlying filesystem instead of the image. Basically, my patch rearranged the tests as: - is a btrfs filesystem - is a directory where before the tests were - is a directory - is a btrfs filesystem In the test 013-xxxx before failed the test "is a directory", now the test "is a btrfs filesystem" fails first. The code fails here: cmds/filesystem.c: static int cmd_filesystem_resize(const struct cmd_struct *cmd, [...] fd = btrfs_open_dir_fd(path); if (fd < 0) { /* The path is a directory */ if (fd == -3) { error( "resize works on mounted filesystems and accepts only\n" "directories as argument. Passing file containing a btrfs image\n" "would resize the underlying filesystem instead of the image.\n"); } return 1; } However the check implemented in btrfs_open_dir_fd() are: btrfs_open_dir_fd() btrfs_open_fd_2() [...] if (stat(path, &st) != 0) { error_on(verbose, "cannot access '%s': %m", path); return -1; } if (statfs(path, &stfs) != 0) { error_on(verbose, "cannot access '%s': %m", path); return -1; } if (stfs.f_type != BTRFS_SUPER_MAGIC) { error_on(verbose, "not a btrfs filesystem: %s", path); // <---- NOW fail first here IF the filesystem is a not return -2; // btrfs } if (dir_only && !S_ISDIR(st.st_mode)) { error_on(verbose, "not a directory: %s", path); // <----- BEFORE failed here return -3; } if (S_ISDIR(st.st_mode) || !read_write) ret = open(path, O_RDONLY); else ret = open(path, O_RDWR); if (ret < 0) { error_on(verbose, "cannot access '%s': %m", path); } IN order to solve the issue, I swapped the check orders; to be consistent with the previous errors. Soon I will send you a V2 version >>> >>> Unlike other commands in the test, this one is called on an image and it >>> must fail to prevent accidentaly resizing underlying filesystem. >> >> >> I will rebase on 6.7 and I will try to reproduce the problem. Could you share the full script, or point me where is it ? > > Ignore my latter request. I found the test code. > >> >> BR >
From: Goffredo Baroncelli <kreijack@inwind.it> For historical reason, the helpers [btrfs_]open_[file_or_]dir() work with directory returning the 'fd' and a 'dirstream' variable returned by opendir(3). If the path is a file, the 'fd' is computed from open(2) and dirstream is set to NULL. If the path is a directory, first the directory is opened by opendir(3), then the 'fd' is computed using dirfd(3). However the 'dirstream' returned by opendir(3) is left open until 'fd' is not needed anymore. In near every case the 'dirstream' variable is not used. Only 'fd' is used. A call to close_file_or_dir() freed both 'fd' and 'dirstream'. Aim of this patch set is to getrid of this complexity; when the path of a directory is passed, the fd is get directly using open(path, O_RDONLY): so we don't need to use readdir(3) and to maintain the not used variable 'dirstream'. So each call of a legacy [btrfs_]open_[file_or_]dir() helper is replaced by a call to the new btrfs_open_[file_or_]dir_fd() functions. These functions return only the file descriptor. Also close_file_or_dir() is not needed anymore. The first patch, introduces the new helpers; the last patch removed the unused older helpers. The intermediate patches updated the code. The 3rd patch updated also the add_seen_fsid() function. Before it stored the dirstream variable. But now it is not needed anymore. So we removed a parameter of the functions and a field in the structure. In the 8th patch, the only occurrences where 'dirstream' is used was corrected: the dirstream is computed using fdopendir(3), and the cleanup is updated according. The results is: - removed 7 functions - add 4 new functions - removed 100 lines - removed 43 occurrences of an unused 'dirstream' variable. BR G.Baroncelli *** BLURB HERE *** Goffredo Baroncelli (9): Killing dirstream: add helpers Killing dirstream: replace btrfs_open_dir with btrfs_open_dir_fd Killing dirstream: replace btrfs_open_dir with btrfs_open_dir_fd Killing dirstream: replace open_path_or_dev_mnt with btrfs_open_mnt_fd Killing dirstream: replace open_file_or_dir3 with btrfs_open_fd2 Killing dirstream: replace btrfs_open_file_or_dir with btrfs_open_file_or_dir_fd Killing dirstream: replace open_file_or_dir with btrfs_open_fd2 Killing dirstream: remove open_file_or_dir3 from du_add_file Killing dirstream: remove unused functions cmds/balance.c | 27 ++++----- cmds/device.c | 26 ++++---- cmds/filesystem-du.c | 18 +++++- cmds/filesystem-usage.c | 5 +- cmds/filesystem.c | 26 ++++---- cmds/inspect.c | 35 +++++------ cmds/property.c | 5 +- cmds/qgroup.c | 29 ++++----- cmds/quota.c | 16 +++-- cmds/replace.c | 17 +++--- cmds/scrub.c | 15 ++--- cmds/subvolume-list.c | 5 +- cmds/subvolume.c | 44 ++++++-------- common/device-scan.c | 6 +- common/device-scan.h | 4 +- common/open-utils.c | 127 ++++++++++++---------------------------- common/open-utils.h | 13 ++-- common/utils.c | 5 +- 18 files changed, 164 insertions(+), 259 deletions(-)