Message ID | CAOg9mSSzJZiv=qBTuLWaEF+FJcVA=UhXtxFPwxiYG8mkn-Rj7Q@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Mar 11, 2016 at 03:18:57PM -0500, Mike Marshall wrote: > Greetings... > > The Orangefs for-next tree is: > > git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git > for-next > > I did a test merge (just locally, not pushed out) of Orangefs:for-next > and v4.5-rc7 so I could test out how I think I need to patch for > the follow_link -> get_link change, the diff is below. > > On Monday next, assuming that v4.5 is finalized this weekend, > I plan to do a actual merge with v4.5, apply the get_link patch > and push that to Orangefs:for-next. > > Hi Al <g>... might we get an ACK this time around? You do realize that it will mean fun few weeks post-merge fixing the rest of problems, right? FWIW, I think that right now it *is* at the state where it such fixing is feasible, so modulo that... As far as I can see, waiting-related logics should be solid by now, ditto for lifetime rules; sanitizing the input... listxattr still does need fixing (feed it a negative in ->downcall.resp.listxattr.lengths[0] and watch Bad Things(tm) happen; no idea why would anyone go for fs/orangefs/downcall.h:82: __s32 lengths[ORANGEFS_MAX_XATTR_LISTLEN]; for representing string lengths in the first place, but that's what you've got there and no sanity checks are done on it beyond if (total + new_op->downcall.resp.listxattr.lengths[i] > size) goto done; which is not enough - not with total and size being ssize_t and ...lengths[] - signed 32bit). The logics around maintaining the list of orangefs superblocks (add/remove/ traverse) needs fixing; right now ioctl(..., ORANGEFS_DEV_REMOUNT_ALL) will walk through it with only request_mutex held. Both insertion and removal are protected only by orangefs_superblocks_lock, and removal is insane - struct list_head *tmp_safe = NULL; \ struct orangefs_sb_info_s *orangefs_sb = NULL; \ \ spin_lock(&orangefs_superblocks_lock); \ list_for_each_safe(tmp, tmp_safe, &orangefs_superblocks) { \ orangefs_sb = list_entry(tmp, \ struct orangefs_sb_info_s, \ list); \ if (orangefs_sb && (orangefs_sb->sb == sb)) { \ gossip_debug(GOSSIP_SUPER_DEBUG, \ "Removing SB %p from orangefs superblocks\n", \ orangefs_sb); \ list_del(&orangefs_sb->list); \ break; \ } \ } \ spin_unlock(&orangefs_superblocks_lock); \ list_entry is never NULL, for starters, and since there is a pointer back from superblock to that orangefs_sb_info, there's no reason to walk the entire list to find one. BTW, both add_orangefs_sb() and remove_orangefs_sb() should be taken to their sole users. Sanity aside, there's really no lock in common for list modifiers and list walker I'd mentioned above. FWIW, I would make orangefs_remount() take struct orangefs_sb_info instead of struct super_block and flipped the order of operations in orangefs_kill_sb() - kill_anon_super() *first*, then remove from the list, then tell the userland that it's going away (i.e. call orangefs_unmount_sb()). request_mutex in the last one would, at least, prevent freeing the sucker before orangefs_remount() is done with it. Walking the list and calling orangefs_remount() on everything would still need care - you'd need to hold orangefs_superblocks_lock, drop it for actual calls of orangefs_remount() and have list removal preserve the forward pointer. That's probably the worst remaining locking issue I see in there. Doable, if not pleasant... IIRC, there also had been some unpleasantness with getattr messing ->i_mode halfway through... <checks> Yes - copy_attributes_to_inode() will be called, and do inode->i_mode = orangefs_inode_perms(attrs); ... inode->i_mode |= S_IFLNK; ... strncpy(orangefs_inode->link_target, symname, ORANGEFS_NAME_MAX); If nothing else, *another* stat(2) racing with this one could pick the intermediate value of ->i_mode and proceed to report it to userland. Another problem is overwriting the symlink body; that can get very unpleasant, since it might be traversed by another syscall right at that moment. Any change of a symlink body means "we'd missed it going stale"; there is no way to change a symlink contents without removing it and creating a new one. Should anything other than orangefs_iget() even bother copying it? The same goes for inode type changes, of course (regular vs. directory vs. symlink, etc.). Speaking of orangefs_iget(), orangefs_set_inode() is pointlessly paranoid. Not a bug per se, but struct orangefs_inode_s *orangefs_inode = NULL; /* Make sure that we have sane parameters */ if (!data || !inode) return 0; orangefs_inode = ORANGEFS_I(inode); if (!orangefs_inode) return 0; is all wrong - 'data' is the last argument passed to iget5_locked (i.e. 'ref' of orangefs_iget()) and that's always an address of either a local variable or of a field in a large structure, and not even the first one; 'inode' is never NULL - it's the address of struct inode the caller is about to insert into the hash chain; ORANGEFS_I() is container_of(), so it's not going to be NULL either. I'll need to look through the archived threads to see if there's anything else left; IIRC, debugfs-related code had seriously nasty issues in case of allocation failures, but those were fairly isolated. I'll read through the archive tomorrow and see if there's anything else mentioned and not dealt with; I don't remember anything really bad, but it had been well over a hundred mails starting about half a year ago; I sure as hell do not remember every tangential subthread in all of that, so I'll need to recheck. I _think_ that all remaining issues can be quickly dealt with, and the code has zero impact on the rest of the kernel. I wouldn't risk putting it into -final without fixups, but as for the merge schedule... either merge it before -rc1 and fix it up by -rc3 or so, or fix it during the window and merge at around -rc2 - I'm fine with either variant. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> either merge it > before -rc1 and fix it up by -rc3 or so, or fix it during the > window and merge at around -rc2 - I'm fine with either > variant. We've kept a list we made from all those mail messages so we could check off things we've tried to address, I was looking at it yesterday and I know it is not up-to-date, but we'll work to get it that way. The second option might be safer unless you help us again, I don't want to sign a rubber check to Linus. -Mike On Fri, Mar 11, 2016 at 4:47 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Fri, Mar 11, 2016 at 03:18:57PM -0500, Mike Marshall wrote: >> Greetings... >> >> The Orangefs for-next tree is: >> >> git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git >> for-next >> >> I did a test merge (just locally, not pushed out) of Orangefs:for-next >> and v4.5-rc7 so I could test out how I think I need to patch for >> the follow_link -> get_link change, the diff is below. >> >> On Monday next, assuming that v4.5 is finalized this weekend, >> I plan to do a actual merge with v4.5, apply the get_link patch >> and push that to Orangefs:for-next. >> >> Hi Al <g>... might we get an ACK this time around? > > You do realize that it will mean fun few weeks post-merge fixing the rest of > problems, right? FWIW, I think that right now it *is* at the state where it > such fixing is feasible, so modulo that... > > As far as I can see, waiting-related logics should be solid by now, ditto > for lifetime rules; sanitizing the input... listxattr still does need fixing > (feed it a negative in ->downcall.resp.listxattr.lengths[0] and watch Bad > Things(tm) happen; no idea why would anyone go for > fs/orangefs/downcall.h:82: __s32 lengths[ORANGEFS_MAX_XATTR_LISTLEN]; > for representing string lengths in the first place, but that's what you've > got there and no sanity checks are done on it beyond > if (total + new_op->downcall.resp.listxattr.lengths[i] > size) > goto done; > which is not enough - not with total and size being ssize_t and ...lengths[] - > signed 32bit). > > The logics around maintaining the list of orangefs superblocks (add/remove/ > traverse) needs fixing; right now ioctl(..., ORANGEFS_DEV_REMOUNT_ALL) will > walk through it with only request_mutex held. Both insertion and removal > are protected only by orangefs_superblocks_lock, and removal is insane - > struct list_head *tmp_safe = NULL; \ > struct orangefs_sb_info_s *orangefs_sb = NULL; \ > \ > spin_lock(&orangefs_superblocks_lock); \ > list_for_each_safe(tmp, tmp_safe, &orangefs_superblocks) { \ > orangefs_sb = list_entry(tmp, \ > struct orangefs_sb_info_s, \ > list); \ > if (orangefs_sb && (orangefs_sb->sb == sb)) { \ > gossip_debug(GOSSIP_SUPER_DEBUG, \ > "Removing SB %p from orangefs superblocks\n", \ > orangefs_sb); \ > list_del(&orangefs_sb->list); \ > break; \ > } \ > } \ > spin_unlock(&orangefs_superblocks_lock); \ > list_entry is never NULL, for starters, and since there is a pointer back > from superblock to that orangefs_sb_info, there's no reason to walk the entire > list to find one. BTW, both add_orangefs_sb() and remove_orangefs_sb() should > be taken to their sole users. > > Sanity aside, there's really no lock in common for list modifiers and list > walker I'd mentioned above. FWIW, I would make orangefs_remount() > take struct orangefs_sb_info instead of struct super_block and flipped the > order of operations in orangefs_kill_sb() - kill_anon_super() *first*, then > remove from the list, then tell the userland that it's going away (i.e. > call orangefs_unmount_sb()). request_mutex in the last one would, at least, > prevent freeing the sucker before orangefs_remount() is done with it. > > Walking the list and calling orangefs_remount() on everything would still need > care - you'd need to hold orangefs_superblocks_lock, drop it for actual calls > of orangefs_remount() and have list removal preserve the forward pointer. > > That's probably the worst remaining locking issue I see in there. Doable, > if not pleasant... > > IIRC, there also had been some unpleasantness with getattr messing ->i_mode > halfway through... <checks> Yes - copy_attributes_to_inode() will be called, > and do > inode->i_mode = orangefs_inode_perms(attrs); > ... > inode->i_mode |= S_IFLNK; > ... > strncpy(orangefs_inode->link_target, > symname, > ORANGEFS_NAME_MAX); > If nothing else, *another* stat(2) racing with this one could pick the > intermediate value of ->i_mode and proceed to report it to userland. > Another problem is overwriting the symlink body; that can get very > unpleasant, since it might be traversed by another syscall right at that > moment. Any change of a symlink body means "we'd missed it going stale"; > there is no way to change a symlink contents without removing it and > creating a new one. Should anything other than orangefs_iget() even bother > copying it? The same goes for inode type changes, of course (regular > vs. directory vs. symlink, etc.). > > Speaking of orangefs_iget(), orangefs_set_inode() is pointlessly paranoid. > Not a bug per se, but > struct orangefs_inode_s *orangefs_inode = NULL; > > /* Make sure that we have sane parameters */ > if (!data || !inode) > return 0; > orangefs_inode = ORANGEFS_I(inode); > if (!orangefs_inode) > return 0; > is all wrong - 'data' is the last argument passed to iget5_locked (i.e. 'ref' > of orangefs_iget()) and that's always an address of either a local variable > or of a field in a large structure, and not even the first one; 'inode' > is never NULL - it's the address of struct inode the caller is about to > insert into the hash chain; ORANGEFS_I() is container_of(), so it's not > going to be NULL either. > > I'll need to look through the archived threads to see if there's anything > else left; IIRC, debugfs-related code had seriously nasty issues in case of > allocation failures, but those were fairly isolated. I'll read through the > archive tomorrow and see if there's anything else mentioned and not dealt > with; I don't remember anything really bad, but it had been well over > a hundred mails starting about half a year ago; I sure as hell do not > remember every tangential subthread in all of that, so I'll need to recheck. > > I _think_ that all remaining issues can be quickly dealt with, and the code > has zero impact on the rest of the kernel. I wouldn't risk putting it into > -final without fixups, but as for the merge schedule... either merge it > before -rc1 and fix it up by -rc3 or so, or fix it during the window and > merge at around -rc2 - I'm fine with either variant. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Al (and everyone)... git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git for-next ... is updated to v4.5, it has the follow_link -> get_link change. I worked today to clean up the debugfs (and sysfs) problems, and we'll keep ticking things off the list, perhaps I should be posting patches here for review instead automatically updating for-next when we work on an issue... We're working on putting "the list" in a place that can be viewed by everyone and edited by both Martin and myself... -Mike On Fri, Mar 11, 2016 at 5:35 PM, Mike Marshall <hubcap@omnibond.com> wrote: >> either merge it >> before -rc1 and fix it up by -rc3 or so, or fix it during the >> window and merge at around -rc2 - I'm fine with either >> variant. > > We've kept a list we made from all those mail messages > so we could check off things we've tried to address, I > was looking at it yesterday and I know it is not up-to-date, > but we'll work to get it that way. The second option > might be safer unless you help us again, I don't want > to sign a rubber check to Linus. > > -Mike > > On Fri, Mar 11, 2016 at 4:47 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: >> On Fri, Mar 11, 2016 at 03:18:57PM -0500, Mike Marshall wrote: >>> Greetings... >>> >>> The Orangefs for-next tree is: >>> >>> git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git >>> for-next >>> >>> I did a test merge (just locally, not pushed out) of Orangefs:for-next >>> and v4.5-rc7 so I could test out how I think I need to patch for >>> the follow_link -> get_link change, the diff is below. >>> >>> On Monday next, assuming that v4.5 is finalized this weekend, >>> I plan to do a actual merge with v4.5, apply the get_link patch >>> and push that to Orangefs:for-next. >>> >>> Hi Al <g>... might we get an ACK this time around? >> >> You do realize that it will mean fun few weeks post-merge fixing the rest of >> problems, right? FWIW, I think that right now it *is* at the state where it >> such fixing is feasible, so modulo that... >> >> As far as I can see, waiting-related logics should be solid by now, ditto >> for lifetime rules; sanitizing the input... listxattr still does need fixing >> (feed it a negative in ->downcall.resp.listxattr.lengths[0] and watch Bad >> Things(tm) happen; no idea why would anyone go for >> fs/orangefs/downcall.h:82: __s32 lengths[ORANGEFS_MAX_XATTR_LISTLEN]; >> for representing string lengths in the first place, but that's what you've >> got there and no sanity checks are done on it beyond >> if (total + new_op->downcall.resp.listxattr.lengths[i] > size) >> goto done; >> which is not enough - not with total and size being ssize_t and ...lengths[] - >> signed 32bit). >> >> The logics around maintaining the list of orangefs superblocks (add/remove/ >> traverse) needs fixing; right now ioctl(..., ORANGEFS_DEV_REMOUNT_ALL) will >> walk through it with only request_mutex held. Both insertion and removal >> are protected only by orangefs_superblocks_lock, and removal is insane - >> struct list_head *tmp_safe = NULL; \ >> struct orangefs_sb_info_s *orangefs_sb = NULL; \ >> \ >> spin_lock(&orangefs_superblocks_lock); \ >> list_for_each_safe(tmp, tmp_safe, &orangefs_superblocks) { \ >> orangefs_sb = list_entry(tmp, \ >> struct orangefs_sb_info_s, \ >> list); \ >> if (orangefs_sb && (orangefs_sb->sb == sb)) { \ >> gossip_debug(GOSSIP_SUPER_DEBUG, \ >> "Removing SB %p from orangefs superblocks\n", \ >> orangefs_sb); \ >> list_del(&orangefs_sb->list); \ >> break; \ >> } \ >> } \ >> spin_unlock(&orangefs_superblocks_lock); \ >> list_entry is never NULL, for starters, and since there is a pointer back >> from superblock to that orangefs_sb_info, there's no reason to walk the entire >> list to find one. BTW, both add_orangefs_sb() and remove_orangefs_sb() should >> be taken to their sole users. >> >> Sanity aside, there's really no lock in common for list modifiers and list >> walker I'd mentioned above. FWIW, I would make orangefs_remount() >> take struct orangefs_sb_info instead of struct super_block and flipped the >> order of operations in orangefs_kill_sb() - kill_anon_super() *first*, then >> remove from the list, then tell the userland that it's going away (i.e. >> call orangefs_unmount_sb()). request_mutex in the last one would, at least, >> prevent freeing the sucker before orangefs_remount() is done with it. >> >> Walking the list and calling orangefs_remount() on everything would still need >> care - you'd need to hold orangefs_superblocks_lock, drop it for actual calls >> of orangefs_remount() and have list removal preserve the forward pointer. >> >> That's probably the worst remaining locking issue I see in there. Doable, >> if not pleasant... >> >> IIRC, there also had been some unpleasantness with getattr messing ->i_mode >> halfway through... <checks> Yes - copy_attributes_to_inode() will be called, >> and do >> inode->i_mode = orangefs_inode_perms(attrs); >> ... >> inode->i_mode |= S_IFLNK; >> ... >> strncpy(orangefs_inode->link_target, >> symname, >> ORANGEFS_NAME_MAX); >> If nothing else, *another* stat(2) racing with this one could pick the >> intermediate value of ->i_mode and proceed to report it to userland. >> Another problem is overwriting the symlink body; that can get very >> unpleasant, since it might be traversed by another syscall right at that >> moment. Any change of a symlink body means "we'd missed it going stale"; >> there is no way to change a symlink contents without removing it and >> creating a new one. Should anything other than orangefs_iget() even bother >> copying it? The same goes for inode type changes, of course (regular >> vs. directory vs. symlink, etc.). >> >> Speaking of orangefs_iget(), orangefs_set_inode() is pointlessly paranoid. >> Not a bug per se, but >> struct orangefs_inode_s *orangefs_inode = NULL; >> >> /* Make sure that we have sane parameters */ >> if (!data || !inode) >> return 0; >> orangefs_inode = ORANGEFS_I(inode); >> if (!orangefs_inode) >> return 0; >> is all wrong - 'data' is the last argument passed to iget5_locked (i.e. 'ref' >> of orangefs_iget()) and that's always an address of either a local variable >> or of a field in a large structure, and not even the first one; 'inode' >> is never NULL - it's the address of struct inode the caller is about to >> insert into the hash chain; ORANGEFS_I() is container_of(), so it's not >> going to be NULL either. >> >> I'll need to look through the archived threads to see if there's anything >> else left; IIRC, debugfs-related code had seriously nasty issues in case of >> allocation failures, but those were fairly isolated. I'll read through the >> archive tomorrow and see if there's anything else mentioned and not dealt >> with; I don't remember anything really bad, but it had been well over >> a hundred mails starting about half a year ago; I sure as hell do not >> remember every tangential subthread in all of that, so I'll need to recheck. >> >> I _think_ that all remaining issues can be quickly dealt with, and the code >> has zero impact on the rest of the kernel. I wouldn't risk putting it into >> -final without fixups, but as for the merge schedule... either merge it >> before -rc1 and fix it up by -rc3 or so, or fix it during the window and >> merge at around -rc2 - I'm fine with either variant. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 11 Mar 2016, Al Viro wrote: > IIRC, there also had been some unpleasantness with getattr messing ->i_mode > halfway through... <checks> Yes - copy_attributes_to_inode() will be called, > and do > inode->i_mode = orangefs_inode_perms(attrs); > ... > inode->i_mode |= S_IFLNK; > ... > strncpy(orangefs_inode->link_target, > symname, > ORANGEFS_NAME_MAX); > If nothing else, *another* stat(2) racing with this one could pick the > intermediate value of ->i_mode and proceed to report it to userland. > Another problem is overwriting the symlink body; that can get very > unpleasant, since it might be traversed by another syscall right at that > moment. Any change of a symlink body means "we'd missed it going stale"; > there is no way to change a symlink contents without removing it and > creating a new one. Should anything other than orangefs_iget() even bother > copying it? The same goes for inode type changes, of course (regular > vs. directory vs. symlink, etc.). I thought I fixed most of that when I went over d_revalidate. But now I see that there's the case that the server attribute changes after revalidate and before the operation e.g. chmod. What happens to the inode we now know is stale? What if two processes notice it's stale at the same time? We can go ahead with the chmod even if the object type has changed since the kernel last saw it. But OrangeFS doesn't have NFS's sillyrename. So what do we do in the middle of read when we get ENOENT? We ignore POSIX and return it to userspace anyway and leave the bad inode around now. There's lots of unsanitized error codes from the server going to userspace now. Will writing i_mode in one shot fix it? -- Martin -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 15 Mar 2016, Martin Brandenburg wrote: > On Fri, 11 Mar 2016, Al Viro wrote: > >> IIRC, there also had been some unpleasantness with getattr messing ->i_mode >> halfway through... <checks> Yes - copy_attributes_to_inode() will be >> called, >> and do >> inode->i_mode = orangefs_inode_perms(attrs); >> ... >> inode->i_mode |= S_IFLNK; >> ... >> strncpy(orangefs_inode->link_target, >> symname, >> ORANGEFS_NAME_MAX); >> If nothing else, *another* stat(2) racing with this one could pick the >> intermediate value of ->i_mode and proceed to report it to userland. >> Another problem is overwriting the symlink body; that can get very >> unpleasant, since it might be traversed by another syscall right at that >> moment. Any change of a symlink body means "we'd missed it going stale"; >> there is no way to change a symlink contents without removing it and >> creating a new one. Should anything other than orangefs_iget() even bother >> copying it? The same goes for inode type changes, of course (regular >> vs. directory vs. symlink, etc.). > > I thought I fixed most of that when I went over > d_revalidate. But now I see that there's the case that > the server attribute changes after revalidate and before > the operation e.g. chmod. What happens to the inode we > now know is stale? What if two processes notice it's > stale at the same time? > > We can go ahead with the chmod even if the object type > has changed since the kernel last saw it. But OrangeFS > doesn't have NFS's sillyrename. So what do we do in the > middle of read when we get ENOENT? We ignore POSIX and > return it to userspace anyway and leave the bad inode > around now. > > There's lots of unsanitized error codes from the server > going to userspace now. > > Will writing i_mode in one shot fix it? > > -- Martin > This is where I'm going with this. Not done yet, but you can see where it's going hopefully. https://github.com/martinbrandenburg/linux.git branch getattr -- Martin -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Mar 14, 2016 at 05:03:55PM -0400, Mike Marshall wrote: > Hi Al (and everyone)... > > git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git > for-next > > ... is updated to v4.5, it has the follow_link -> get_link change. > > I worked today to clean up the debugfs (and sysfs) problems, > and we'll keep ticking things off the list, perhaps I should be > posting patches here for review instead automatically updating > for-next when we work on an issue... > > We're working on putting "the list" in a place that can be > viewed by everyone and edited by both Martin and myself... FWIW, I've ported several cleanups + short read/write on error + lseek fixes + orangefs_superblocks locking fixes on top of that branch and pushed into vfs.git#orangefs-untested. I think it can be merged this cycle; if the fixes above really work, this + Martin's latest should leave only the handling of allocation failures in debugfs-related code. Linus, would you be OK with that thing going into -rc2? It's self-contained and I think it's in reasonably sane shape. If not for the last locking fixes, I would even suggest -rc1 with fixups of debugfs issues during the next week, but that last commit got no testing whatsoever... ;-/ I would prefer if pull request went from Mike, once he's OK with the whole set, but as far as I'm concerned the material in his #for-next + my pile above + Martin's branch (getattr fixes) should be OK for mainline merge. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Wow... the superblock problem is something we haven't worked on yet... thanks! Martin's new code caused some xfstests regressions, but we got past all but the generic/028 regression... I don't know if Martin worked on that today, I wasn't in the office. I had fixed up what I saw wrong in debugfs, so I'll have to look again for the allocation failures... I haven't pushed Martin's re-worked code to kernel.org yet, since we were working through the couple of regressions, have you looked at it on his github repo, and it doesn't conflict with your patches? I'll look and see if I can get your patches to load on top of Martin's patches and get that pushed to kernel.org. I leave before dawn on Sunday morning for that Linux Foundation summit in Lake Tahoe, so if we get this thing merged now (I hope <g>) we'll have to continue to polish it through the rc's, but that's what they're there for, right? -Mike On Fri, Mar 25, 2016 at 8:21 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Mon, Mar 14, 2016 at 05:03:55PM -0400, Mike Marshall wrote: >> Hi Al (and everyone)... >> >> git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux.git >> for-next >> >> ... is updated to v4.5, it has the follow_link -> get_link change. >> >> I worked today to clean up the debugfs (and sysfs) problems, >> and we'll keep ticking things off the list, perhaps I should be >> posting patches here for review instead automatically updating >> for-next when we work on an issue... >> >> We're working on putting "the list" in a place that can be >> viewed by everyone and edited by both Martin and myself... > > FWIW, I've ported several cleanups + short read/write on error + lseek > fixes + orangefs_superblocks locking fixes on top of that branch and pushed > into vfs.git#orangefs-untested. > > I think it can be merged this cycle; if the fixes above really work, this > + Martin's latest should leave only the handling of allocation failures in > debugfs-related code. > > Linus, would you be OK with that thing going into -rc2? It's self-contained > and I think it's in reasonably sane shape. If not for the last locking fixes, > I would even suggest -rc1 with fixups of debugfs issues during the next week, > but that last commit got no testing whatsoever... ;-/ > > I would prefer if pull request went from Mike, once he's OK with the whole > set, but as far as I'm concerned the material in his #for-next + my pile above > + Martin's branch (getattr fixes) should be OK for mainline merge. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Mar 25, 2016 at 05:28:33PM -0700, Linus Torvalds wrote: > Gaah. I'm going to be traveling starting tomorrow night (red-eye) and I was > going to cut rc1 tomorrow and then really hope for a calm rc2 week. > > At the same time, is live to get this finally merged, about a year after I > originally thought were would. So I guess I'll take it, even if I hate the > timing. Umm... IIRC, there had been precedents when self-contained driver got merged later than -rc1. Might make more sense than rushing the merge in the last day of the window, especially since you are travelling... > There are zero changes outside orangefs itself, and obviously the > Makefile/KConfig file to get it all enabled? Yes. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
There's the orangefs.txt in Documentation. And I'm listed as a maintainer, I forget what file that's in... But you're right, we mostly don't touch anything out of fs/orangefs... -Mike On Fri, Mar 25, 2016 at 8:28 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Mar 25, 2016 5:21 PM, "Al Viro" <viro@zeniv.linux.org.uk> wrote: >> >> Linus, would you be OK with that thing going into -rc2? It's >> self-contained >> and I think it's in reasonably sane shape. If not for the last locking >> fixes, >> I would even suggest -rc1 with fixups of debugfs issues during the next >> week, >> but that last commit got no testing whatsoever... ;-/ > > Gaah. I'm going to be traveling starting tomorrow night (red-eye) and I was > going to cut rc1 tomorrow and then really hope for a calm rc2 week. > > At the same time, is live to get this finally merged, about a year after I > originally thought were would. So I guess I'll take it, even if I hate the > timing. > > There are zero changes outside orangefs itself, and obviously the > Makefile/KConfig file to get it all enabled? > > Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
If Al's patches load on top of Martin's patches without conflict, I can probably get it all merged and minimally tested before noon my time (eastern US) tomorrow... that's probably late for where y'all are though... -Mike On Fri, Mar 25, 2016 at 9:01 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Fri, Mar 25, 2016 at 05:28:33PM -0700, Linus Torvalds wrote: >> Gaah. I'm going to be traveling starting tomorrow night (red-eye) and I was >> going to cut rc1 tomorrow and then really hope for a calm rc2 week. >> >> At the same time, is live to get this finally merged, about a year after I >> originally thought were would. So I guess I'll take it, even if I hate the >> timing. > > Umm... IIRC, there had been precedents when self-contained driver got merged > later than -rc1. Might make more sense than rushing the merge in the last > day of the window, especially since you are travelling... > >> There are zero changes outside orangefs itself, and obviously the >> Makefile/KConfig file to get it all enabled? > > Yes. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
I got Al's patches merged with Martin's patches and only one pretty simple conflict to fix. In testing, most things seem to work right, but at least one important thing is busted: umount... unmounting the filesystem just gets the kernel into a forever loop, "ERROR: fs_mount_pending 147936488" just goes on forever until I interrupt the umount command: Mar 25 23:42:00 be1 kernel: [ 156.990737] Releasing OP (ffff8800189a0000: 103) Mar 25 23:42:00 be1 kernel: [ 156.991366] orangefs_kill_sb: called Mar 25 23:42:00 be1 kernel: [ 156.991401] orangefs_destroy_inode: deallocated ffff880018998000 destroying inode 00001000-0000-0000-0000-000000000000 Mar 25 23:42:00 be1 kernel: [ 156.991414] orangefs_unmount_sb called on sb ffff880018810000 Mar 25 23:42:00 be1 kernel: [ 156.991416] Alloced OP (ffff8800189a0000: 104 OP_FS_UMOUNT) Mar 25 23:42:00 be1 kernel: [ 156.991417] Attempting ORANGEFS Unmount via host tcp://be1:3334/orangefs Mar 25 23:42:00 be1 kernel: [ 156.991418] service_operation: orangefs_fs_umount op:ffff8800189a0000: process:umount: pid:1124: Mar 25 23:42:00 be1 kernel: [ 156.991419] service_operation: op:OP_FS_UMOUNT: op_state:1: process:umount: Mar 25 23:42:00 be1 kernel: [ 156.991435] orangefs: skipping op tag 104 OP_FS_UMOUNT Mar 25 23:42:00 be1 kernel: [ 156.991435] orangefs: ERROR: fs_mount_pending 147936488 Mar 25 23:42:00 be1 kernel: [ 156.991440] orangefs: skipping op tag 104 OP_FS_UMOUNT The github repo has all the patches, I didn't think I should push it to the orangefs for-next branch like this. https://github.com/hubcapsc/linux/tree/current I'll look to see if I can see it, I guess it has something to do with Al's superblock re-do... -Mike On Fri, Mar 25, 2016 at 9:11 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Mar 25, 2016 18:01, "Al Viro" <viro@zeniv.linux.org.uk> wrote: >> >> Umm... IIRC, there had been precedents when self-contained driver got >> merged >> later than -rc1. > > I do it if there is some reason for it, yes. > > In particular, for things like consumer hardware that has not had a driver > for it, I'll merge new drivers later, because missing one release nasty be a > big deal if some average-Joe machine just doesn't work. But even then it's > not just "random driver", but something that keeps that machine from working > entirely - disk or network or GPU or USB or something core like that. > > So it's not just "new driver that cannot cause a regression". It's also > about the driver being important to people who can't reasonably be expected > to compile their own kernel.. > > There is little reason to do an expedited out-of-the-merge-window merge for > something like orangefs, though. > > Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c index 4e923ec..50a2172 100644 --- a/fs/orangefs/inode.c +++ b/fs/orangefs/inode.c @@ -276,9 +276,12 @@ int orangefs_getattr(struct vfsmount *mnt, ret = orangefs_inode_getattr(inode, ORANGEFS_ATTR_SYS_ALL_NOHINT, 0); if (ret == 0) { generic_fillattr(inode, kstat); + /* override block size reported to stat */ orangefs_inode = ORANGEFS_I(inode); kstat->blksize = orangefs_inode->blksize; + + inode->i_link = ORANGEFS_I(dentry->d_inode)->link_target; } else { /* assume an I/O error and flag inode as bad */ gossip_debug(GOSSIP_INODE_DEBUG, diff --git a/fs/orangefs/symlink.c b/fs/orangefs/symlink.c index 2b8541a..6418dd6 100644 --- a/fs/orangefs/symlink.c +++ b/fs/orangefs/symlink.c @@ -8,22 +8,9 @@ #include "orangefs-kernel.h" #include "orangefs-bufmap.h" -static const char *orangefs_follow_link(struct dentry *dentry, void **cookie) -{ - char *target = ORANGEFS_I(dentry->d_inode)->link_target; - - gossip_debug(GOSSIP_INODE_DEBUG, - "%s: called on %s (target is %p)\n", - __func__, (char *)dentry->d_name.name, target); - - *cookie = target; - - return target; -} - struct inode_operations orangefs_symlink_inode_operations = { .readlink = generic_readlink, - .follow_link = orangefs_follow_link, + .get_link = simple_get_link, .setattr = orangefs_setattr, .getattr = orangefs_getattr, .listxattr = orangefs_listxattr,