Message ID | a602433c-ec36-a607-e1bc-6e532e3ebaca@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v5] fsstress: add renameat2 support | expand |
On Tue, Oct 22, 2019 at 08:19:37PM +0800, kaixuxia wrote: > Support the renameat2 syscall in fsstress. > > Signed-off-by: kaixuxia <kaixuxia@tencent.com> > --- > Changes in v5: > - Fix the RENAME_EXCHANGE flist fents swap problem. > > ltp/fsstress.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 169 insertions(+), 33 deletions(-) > > diff --git a/ltp/fsstress.c b/ltp/fsstress.c > index 51976f5..7c59f2d 100644 > --- a/ltp/fsstress.c > +++ b/ltp/fsstress.c ... > @@ -4269,16 +4367,31 @@ rename_f(int opno, long r) > oldid = fep->id; > fix_parent(oldid, id); > } > - del_from_flist(flp - flist, fep - flp->fents); > - add_to_flist(flp - flist, id, parid, xattr_counter); > + > + if (mode == RENAME_WHITEOUT) { > + add_to_flist(FT_DEV, fep->id, fep->parent, 0); > + del_from_flist(flp - flist, fep - flp->fents); > + add_to_flist(flp - flist, id, parid, xattr_counter); > + } else if (mode == RENAME_EXCHANGE) { > + if (dflp - flist == FT_DIR) { > + oldid = dfep->id; > + fix_parent(oldid, fep->id); > + } > + swap_flist_fents(flp - flist, fep - flp->fents, > + dflp - flist, dfep - dflp->fents); Hmm.. sorry, but this is still a little confusing. One thing I realized when running this is that the id correlates with filename and the filename correlates to type (i.e., fN for files, cN for devs, dN for dirs, etc.). This means that we can now end up doing something like this: 0/8: rename(EXCHANGE) c4 to f5 0 0/8: rename source entry: id=5,parent=-1 0/8: rename target entry: id=5,parent=-1 ... which leaves an 'f5' device node and 'c4' regular file. Because of this, I'm wondering if we should just restrict rexchange to files of the same type and keep this simple. That means we would use the file type of the source file when looking up a destination to exchange with (instead of FT_ANY). With regard to fixing up the flist, this leaves two general cases: - Between two non-dirs: REXCHANGE f0 <-> d3/f5 The id -> parent relationship actually hasn't changed because both file entries still exist just as before the call. We've basically just swapped inodes from the directory tree perspective. This means xattr_count needs to be swapped between the entries. - Between two dirs: REXCHANGE d1 <-> d2/d3 I think the same thing applies as above with regard to the parent ids of the directories themselves. E.g., d3 is still under d2, it just now needs the xattr_count from the old d1 and vice versa. Additionally, all of the children of d2/d3 are now under d1 and vice versa, so those parent ids need to be swapped. That said, we can't just call fix_parent() to swap all parentid == 1 to 3 and then repeat for 3 -> 1 because that would put everything under 1. Instead, it seems like we actually need a single fix_parent() sweep to change all 1 -> 3 and 3 -> 1 parent ids in a single pass. Moving on to RWHITEOUT, the above means that we leave a dev node around with whatever the name of the source file was. That implies we should restrict RWHITEOUT to device nodes if we want to maintain filelist/filename integrity. The immediate question is: would that allow the associated fstest to still reproduce the deadlock problem? I think it should, but we should confirm that (i.e., the test now needs to do '-fmknod=NN' instead of '-fcreat=NN'). Thoughts? Does that all sound reasonable/correct or have I misinterpreted things? Finally, given the complexity disparity between the two operations, at this point I'd suggest to split this into two patches (one for general renameat2 support + rwhiteout, another for rexchange support on top). Brian > + } else { > + del_from_flist(flp - flist, fep - flp->fents); > + add_to_flist(flp - flist, id, parid, xattr_counter); > + } > } > if (v) { > - printf("%d/%d: rename %s to %s %d\n", procid, opno, f.path, > + printf("%d/%d: rename(%s) %s to %s %d\n", procid, > + opno, translate_renameat2_flags(mode), f.path, > newf.path, e); > if (e == 0) { > - printf("%d/%d: rename del entry: id=%d,parent=%d\n", > + printf("%d/%d: rename source entry: id=%d,parent=%d\n", > procid, opno, fep->id, fep->parent); > - printf("%d/%d: rename add entry: id=%d,parent=%d\n", > + printf("%d/%d: rename target entry: id=%d,parent=%d\n", > procid, opno, id, parid); > } > } > @@ -4287,6 +4400,29 @@ rename_f(int opno, long r) > } > > void > +rename_f(int opno, long r) > +{ > + do_renameat2(opno, r, 0); > +} > +void > +rnoreplace_f(int opno, long r) > +{ > + do_renameat2(opno, r, RENAME_NOREPLACE); > +} > + > +void > +rexchange_f(int opno, long r) > +{ > + do_renameat2(opno, r, RENAME_EXCHANGE); > +} > + > +void > +rwhiteout_f(int opno, long r) > +{ > + do_renameat2(opno, r, RENAME_WHITEOUT); > +} > + > +void > resvsp_f(int opno, long r) > { > int e; > -- > 1.8.3.1 > > -- > kaixuxia
On 2019/10/23 21:01, Brian Foster wrote: > On Tue, Oct 22, 2019 at 08:19:37PM +0800, kaixuxia wrote: >> Support the renameat2 syscall in fsstress. >> >> Signed-off-by: kaixuxia <kaixuxia@tencent.com> >> --- >> Changes in v5: >> - Fix the RENAME_EXCHANGE flist fents swap problem. >> >> ltp/fsstress.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++---------- >> 1 file changed, 169 insertions(+), 33 deletions(-) >> >> diff --git a/ltp/fsstress.c b/ltp/fsstress.c >> index 51976f5..7c59f2d 100644 >> --- a/ltp/fsstress.c >> +++ b/ltp/fsstress.c > ... >> @@ -4269,16 +4367,31 @@ rename_f(int opno, long r) >> oldid = fep->id; >> fix_parent(oldid, id); >> } >> - del_from_flist(flp - flist, fep - flp->fents); >> - add_to_flist(flp - flist, id, parid, xattr_counter); >> + >> + if (mode == RENAME_WHITEOUT) { >> + add_to_flist(FT_DEV, fep->id, fep->parent, 0); >> + del_from_flist(flp - flist, fep - flp->fents); >> + add_to_flist(flp - flist, id, parid, xattr_counter); >> + } else if (mode == RENAME_EXCHANGE) { >> + if (dflp - flist == FT_DIR) { >> + oldid = dfep->id; >> + fix_parent(oldid, fep->id); >> + } >> + swap_flist_fents(flp - flist, fep - flp->fents, >> + dflp - flist, dfep - dflp->fents); > > Hmm.. sorry, but this is still a little confusing. One thing I realized > when running this is that the id correlates with filename and the > filename correlates to type (i.e., fN for files, cN for devs, dN for > dirs, etc.). This means that we can now end up doing something like > this: > > 0/8: rename(EXCHANGE) c4 to f5 0 > 0/8: rename source entry: id=5,parent=-1 > 0/8: rename target entry: id=5,parent=-1 > I think the source entry id and parentid here are overwritten by del_from_flist() call above for all kinds of rename operations, we should show the actually values. > ... which leaves an 'f5' device node and 'c4' regular file. Because of > this, I'm wondering if we should just restrict rexchange to files of the > same type and keep this simple. That means we would use the file type of > the source file when looking up a destination to exchange with (instead > of FT_ANY). >Sounds reasonable, will do this in next version. > With regard to fixing up the flist, this leaves two general cases: > > - Between two non-dirs: REXCHANGE f0 <-> d3/f5 > > The id -> parent relationship actually hasn't changed because both file > entries still exist just as before the call. We've basically just > swapped inodes from the directory tree perspective. This means > xattr_count needs to be swapped between the entries. > > - Between two dirs: REXCHANGE d1 <-> d2/d3 > > I think the same thing applies as above with regard to the parent ids of > the directories themselves. E.g., d3 is still under d2, it just now > needs the xattr_count from the old d1 and vice versa. Additionally, all > of the children of d2/d3 are now under d1 and vice versa, so those > parent ids need to be swapped. That said, we can't just call > fix_parent() to swap all parentid == 1 to 3 and then repeat for 3 -> 1 > because that would put everything under 1. Instead, it seems like we > actually need a single fix_parent() sweep to change all 1 -> 3 and 3 -> > 1 parent ids in a single pass. > Sure. > Moving on to RWHITEOUT, the above means that we leave a dev node around > with whatever the name of the source file was. That implies we should > restrict RWHITEOUT to device nodes if we want to maintain > filelist/filename integrity. The immediate question is: would that allow > the associated fstest to still reproduce the deadlock problem? I think > it should, but we should confirm that (i.e., the test now needs to do > '-fmknod=NN' instead of '-fcreat=NN'). > Yeah, I have tested this on my vm when restricting RWHITEOUT to device nodes, the associated fstest still can reproduced the deadlock problem, and the run time is very short. > Thoughts? Does that all sound reasonable/correct or have I > misinterpreted things? > > Finally, given the complexity disparity between the two operations, at > this point I'd suggest to split this into two patches (one for general > renameat2 support + rwhiteout, another for rexchange support on top). Thanks for your comments, will do it in next version. > > Brian > >> + } else { >> + del_from_flist(flp - flist, fep - flp->fents); >> + add_to_flist(flp - flist, id, parid, xattr_counter); >> + } >> } >> if (v) { >> - printf("%d/%d: rename %s to %s %d\n", procid, opno, f.path, >> + printf("%d/%d: rename(%s) %s to %s %d\n", procid, >> + opno, translate_renameat2_flags(mode), f.path, >> newf.path, e); >> if (e == 0) { >> - printf("%d/%d: rename del entry: id=%d,parent=%d\n", >> + printf("%d/%d: rename source entry: id=%d,parent=%d\n", >> procid, opno, fep->id, fep->parent); >> - printf("%d/%d: rename add entry: id=%d,parent=%d\n", >> + printf("%d/%d: rename target entry: id=%d,parent=%d\n", >> procid, opno, id, parid); >> } >> } >> @@ -4287,6 +4400,29 @@ rename_f(int opno, long r) >> } >> >> void >> +rename_f(int opno, long r) >> +{ >> + do_renameat2(opno, r, 0); >> +} >> +void >> +rnoreplace_f(int opno, long r) >> +{ >> + do_renameat2(opno, r, RENAME_NOREPLACE); >> +} >> + >> +void >> +rexchange_f(int opno, long r) >> +{ >> + do_renameat2(opno, r, RENAME_EXCHANGE); >> +} >> + >> +void >> +rwhiteout_f(int opno, long r) >> +{ >> + do_renameat2(opno, r, RENAME_WHITEOUT); >> +} >> + >> +void >> resvsp_f(int opno, long r) >> { >> int e; >> -- >> 1.8.3.1 >> >> -- >> kaixuxia >
diff --git a/ltp/fsstress.c b/ltp/fsstress.c index 51976f5..7c59f2d 100644 --- a/ltp/fsstress.c +++ b/ltp/fsstress.c @@ -44,6 +44,38 @@ io_context_t io_ctx; #define IOV_MAX 1024 #endif +#ifndef HAVE_RENAMEAT2 +#if !defined(SYS_renameat2) && defined(__x86_64__) +#define SYS_renameat2 316 +#endif + +#if !defined(SYS_renameat2) && defined(__i386__) +#define SYS_renameat2 353 +#endif + +static int renameat2(int dfd1, const char *path1, + int dfd2, const char *path2, + unsigned int flags) +{ +#ifdef SYS_renameat2 + return syscall(SYS_renameat2, dfd1, path1, dfd2, path2, flags); +#else + errno = ENOSYS; + return -1; +#endif +} +#endif + +#ifndef RENAME_NOREPLACE +#define RENAME_NOREPLACE (1 << 0) /* Don't overwrite target */ +#endif +#ifndef RENAME_EXCHANGE +#define RENAME_EXCHANGE (1 << 1) /* Exchange source and dest */ +#endif +#ifndef RENAME_WHITEOUT +#define RENAME_WHITEOUT (1 << 2) /* Whiteout source */ +#endif + #define FILELEN_MAX (32*4096) typedef enum { @@ -85,6 +117,9 @@ typedef enum { OP_READV, OP_REMOVEFATTR, OP_RENAME, + OP_RNOREPLACE, + OP_REXCHANGE, + OP_RWHITEOUT, OP_RESVSP, OP_RMDIR, OP_SETATTR, @@ -203,6 +238,9 @@ void readlink_f(int, long); void readv_f(int, long); void removefattr_f(int, long); void rename_f(int, long); +void rnoreplace_f(int, long); +void rexchange_f(int, long); +void rwhiteout_f(int, long); void resvsp_f(int, long); void rmdir_f(int, long); void setattr_f(int, long); @@ -262,6 +300,9 @@ opdesc_t ops[] = { /* remove (delete) extended attribute */ { OP_REMOVEFATTR, "removefattr", removefattr_f, 1, 1 }, { OP_RENAME, "rename", rename_f, 2, 1 }, + { OP_RNOREPLACE, "rnoreplace", rnoreplace_f, 2, 1 }, + { OP_REXCHANGE, "rexchange", rexchange_f, 2, 1 }, + { OP_RWHITEOUT, "rwhiteout", rwhiteout_f, 2, 1 }, { OP_RESVSP, "resvsp", resvsp_f, 1, 1 }, { OP_RMDIR, "rmdir", rmdir_f, 1, 1 }, /* set attribute flag (FS_IOC_SETFLAGS ioctl) */ @@ -321,6 +362,7 @@ int execute_freq = 1; struct print_string flag_str = {0}; void add_to_flist(int, int, int, int); +void swap_flist_fents(int, int, int, int); void append_pathname(pathname_t *, char *); int attr_list_path(pathname_t *, char *, const int, int, attrlist_cursor_t *); int attr_remove_path(pathname_t *, const char *, int); @@ -354,7 +396,7 @@ int open_path(pathname_t *, int); DIR *opendir_path(pathname_t *); void process_freq(char *); int readlink_path(pathname_t *, char *, size_t); -int rename_path(pathname_t *, pathname_t *); +int rename_path(pathname_t *, pathname_t *, int); int rmdir_path(pathname_t *); void separate_pathname(pathname_t *, char *, pathname_t *); void show_ops(int, char *); @@ -758,6 +800,24 @@ add_to_flist(int ft, int id, int parent, int xattr_counter) } void +swap_flist_fents(int ft, int slot, int dft, int dslot) +{ + flist_t *ftp; + flist_t *dftp; + fent_t tmpfent; + + ftp = &flist[ft]; + dftp = &flist[dft]; + if (ft == FT_DIR) + dcache_purge(ftp->fents[slot].id); + if (dft == FT_DIR) + dcache_purge(dftp->fents[dslot].id); + tmpfent = ftp->fents[slot]; + ftp->fents[slot] = dftp->fents[dslot]; + dftp->fents[dslot] = tmpfent; +} + +void append_pathname(pathname_t *name, char *str) { int len; @@ -1519,7 +1579,7 @@ readlink_path(pathname_t *name, char *lbuf, size_t lbufsiz) } int -rename_path(pathname_t *name1, pathname_t *name2) +rename_path(pathname_t *name1, pathname_t *name2, int mode) { char buf1[NAME_MAX + 1]; char buf2[NAME_MAX + 1]; @@ -1528,14 +1588,18 @@ rename_path(pathname_t *name1, pathname_t *name2) pathname_t newname2; int rval; - rval = rename(name1->path, name2->path); + if (mode == 0) + rval = rename(name1->path, name2->path); + else + rval = renameat2(AT_FDCWD, name1->path, + AT_FDCWD, name2->path, mode); if (rval >= 0 || errno != ENAMETOOLONG) return rval; separate_pathname(name1, buf1, &newname1); separate_pathname(name2, buf2, &newname2); if (strcmp(buf1, buf2) == 0) { if (chdir(buf1) == 0) { - rval = rename_path(&newname1, &newname2); + rval = rename_path(&newname1, &newname2, mode); assert(chdir("..") == 0); } } else { @@ -1555,7 +1619,7 @@ rename_path(pathname_t *name1, pathname_t *name2) append_pathname(&newname2, "../"); append_pathname(&newname2, name2->path); if (chdir(buf1) == 0) { - rval = rename_path(&newname1, &newname2); + rval = rename_path(&newname1, &newname2, mode); assert(chdir("..") == 0); } } else { @@ -1563,7 +1627,7 @@ rename_path(pathname_t *name1, pathname_t *name2) append_pathname(&newname1, "../"); append_pathname(&newname1, name1->path); if (chdir(buf2) == 0) { - rval = rename_path(&newname1, &newname2); + rval = rename_path(&newname1, &newname2, mode); assert(chdir("..") == 0); } } @@ -4215,10 +4279,21 @@ out: free_pathname(&f); } +struct print_flags renameat2_flags [] = { + { RENAME_NOREPLACE, "NOREPLACE"}, + { RENAME_EXCHANGE, "EXCHANGE"}, + { RENAME_WHITEOUT, "WHITEOUT"}, + { -1, NULL} +}; + +#define translate_renameat2_flags(mode) \ + ({translate_flags(mode, "|", renameat2_flags);}) + void -rename_f(int opno, long r) +do_renameat2(int opno, long r, int mode) { fent_t *dfep; + flist_t *dflp; int e; pathname_t f; fent_t *fep; @@ -4234,33 +4309,56 @@ rename_f(int opno, long r) init_pathname(&f); if (!get_fname(FT_ANYm, r, &f, &flp, &fep, &v1)) { if (v1) - printf("%d/%d: rename - no filename\n", procid, opno); + printf("%d/%d: rename - no source filename\n", + procid, opno); free_pathname(&f); return; } + /* Both pathnames must exist for the RENAME_EXCHANGE */ + if (mode == RENAME_EXCHANGE) { + init_pathname(&newf); + if (!get_fname(FT_ANYm, random(), &newf, &dflp, &dfep, &v)) { + if (v) + printf("%d/%d: rename - no target filename\n", + procid, opno); + free_pathname(&newf); + free_pathname(&f); + return; + } + v |= v1; + id = dfep->id; + parid = dfep->parent; + } else { + /* + * get an existing directory for the destination parent + * directory name. + */ + if (!get_fname(FT_DIRm, random(), NULL, NULL, &dfep, &v)) + parid = -1; + else + parid = dfep->id; + v |= v1; - /* get an existing directory for the destination parent directory name */ - if (!get_fname(FT_DIRm, random(), NULL, NULL, &dfep, &v)) - parid = -1; - else - parid = dfep->id; - v |= v1; - - /* generate a new path using an existing parent directory in name */ - init_pathname(&newf); - e = generate_fname(dfep, flp - flist, &newf, &id, &v1); - v |= v1; - if (!e) { - if (v) { - (void)fent_to_name(&f, &flist[FT_DIR], dfep); - printf("%d/%d: rename - no filename from %s\n", - procid, opno, f.path); + /* + * generate a new path using an existing parent directory + * in name. + */ + init_pathname(&newf); + e = generate_fname(dfep, flp - flist, &newf, &id, &v1); + v |= v1; + if (!e) { + if (v) { + (void)fent_to_name(&f, &flist[FT_DIR], dfep); + printf("%d/%d: rename - no filename from %s\n", + procid, opno, f.path); + } + free_pathname(&newf); + free_pathname(&f); + return; } - free_pathname(&newf); - free_pathname(&f); - return; } - e = rename_path(&f, &newf) < 0 ? errno : 0; + + e = rename_path(&f, &newf, mode) < 0 ? errno : 0; check_cwd(); if (e == 0) { int xattr_counter = fep->xattr_counter; @@ -4269,16 +4367,31 @@ rename_f(int opno, long r) oldid = fep->id; fix_parent(oldid, id); } - del_from_flist(flp - flist, fep - flp->fents); - add_to_flist(flp - flist, id, parid, xattr_counter); + + if (mode == RENAME_WHITEOUT) { + add_to_flist(FT_DEV, fep->id, fep->parent, 0); + del_from_flist(flp - flist, fep - flp->fents); + add_to_flist(flp - flist, id, parid, xattr_counter); + } else if (mode == RENAME_EXCHANGE) { + if (dflp - flist == FT_DIR) { + oldid = dfep->id; + fix_parent(oldid, fep->id); + } + swap_flist_fents(flp - flist, fep - flp->fents, + dflp - flist, dfep - dflp->fents); + } else { + del_from_flist(flp - flist, fep - flp->fents); + add_to_flist(flp - flist, id, parid, xattr_counter); + } } if (v) { - printf("%d/%d: rename %s to %s %d\n", procid, opno, f.path, + printf("%d/%d: rename(%s) %s to %s %d\n", procid, + opno, translate_renameat2_flags(mode), f.path, newf.path, e); if (e == 0) { - printf("%d/%d: rename del entry: id=%d,parent=%d\n", + printf("%d/%d: rename source entry: id=%d,parent=%d\n", procid, opno, fep->id, fep->parent); - printf("%d/%d: rename add entry: id=%d,parent=%d\n", + printf("%d/%d: rename target entry: id=%d,parent=%d\n", procid, opno, id, parid); } } @@ -4287,6 +4400,29 @@ rename_f(int opno, long r) } void +rename_f(int opno, long r) +{ + do_renameat2(opno, r, 0); +} +void +rnoreplace_f(int opno, long r) +{ + do_renameat2(opno, r, RENAME_NOREPLACE); +} + +void +rexchange_f(int opno, long r) +{ + do_renameat2(opno, r, RENAME_EXCHANGE); +} + +void +rwhiteout_f(int opno, long r) +{ + do_renameat2(opno, r, RENAME_WHITEOUT); +} + +void resvsp_f(int opno, long r) { int e;
Support the renameat2 syscall in fsstress. Signed-off-by: kaixuxia <kaixuxia@tencent.com> --- Changes in v5: - Fix the RENAME_EXCHANGE flist fents swap problem. ltp/fsstress.c | 202 +++++++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 169 insertions(+), 33 deletions(-)