Message ID: 20160217230900.GP17997@ZenIV.linux.org.uk (mailing list archive)
State: New, archived
On Wed, Feb 17, 2016 at 11:09:00PM +0000, Al Viro wrote:
> However, that doesn't prevent all unpleasantness here - giving up just as
> it's being copied to userland and going into restart.  Ho-hum...  How about
> the following:
> 	* move increment of op->attempts into the same place where we
> 	  set "given up"
> 	* in addition to the check for "given up" in the request-picking loop
> 	  (as above), fetch op->attempts before dropping op->lock
> 	* after having retaken op->lock (after copy_to_user()) recheck
> 	  op->attempts instead of checking for "given up".

Crap...  There's a similar problem on the other end - in
orangefs_devreq_write_iter(), between the time when op has been fetched from
the hash and the time we finish copying the reply from userland.  Same kind
of "what if clear_... gets all the way through and we resubmit it before we
get around to checking the given_up flag" problem...  Let me think a bit...
diff --git a/fs/orangefs/devorangefs-req.c b/fs/orangefs/devorangefs-req.c
index b27ed1c..1938d55 100644
--- a/fs/orangefs/devorangefs-req.c
+++ b/fs/orangefs/devorangefs-req.c
@@ -109,6 +109,7 @@ static ssize_t orangefs_devreq_read(struct file *file,
 	static __s32 magic = ORANGEFS_DEVREQ_MAGIC;
 	struct orangefs_kernel_op_s *cur_op = NULL;
 	unsigned long ret;
+	int attempts;
 
 	/* We do not support blocking IO. */
 	if (!(file->f_flags & O_NONBLOCK)) {
@@ -133,7 +134,7 @@ restart:
 		__s32 fsid;
 		/* This lock is held past the end of the loop when we break. */
 		spin_lock(&op->lock);
-		if (unlikely(op_state_purged(op))) {
+		if (unlikely(op_state_purged(op) || op_state_given_up(op))) {
 			spin_unlock(&op->lock);
 			continue;
 		}
@@ -207,6 +208,7 @@ restart:
 	list_del_init(&cur_op->list);
 	get_op(op);
 	spin_unlock(&orangefs_request_list_lock);
+	attempts = op->attempts;
 
 	spin_unlock(&cur_op->lock);
 
@@ -227,7 +229,8 @@ restart:
 	spin_lock(&htable_ops_in_progress_lock);
 	spin_lock(&cur_op->lock);
 
-	if (unlikely(op_state_given_up(cur_op))) {
+	if (unlikely(cur_op->attempts != attempts)) {
+		/* given up just as we copied to userland */
 		spin_unlock(&cur_op->lock);
 		spin_unlock(&htable_ops_in_progress_lock);
 		op_release(cur_op);
diff --git a/fs/orangefs/waitqueue.c b/fs/orangefs/waitqueue.c
index d980240..cc43ac8 100644
--- a/fs/orangefs/waitqueue.c
+++ b/fs/orangefs/waitqueue.c
@@ -139,7 +139,6 @@ retry_servicing:
 	op->downcall.status = ret;
 	/* retry if operation has not been serviced and if requested */
 	if (ret == -EAGAIN) {
-		op->attempts++;
 		timeout = op_timeout_secs * HZ;
 		gossip_debug(GOSSIP_WAIT_DEBUG,
 			     "orangefs: tag %llu (%s)"
@@ -208,6 +207,7 @@ static void orangefs_clean_up_interrupted_operation(struct orangefs_kernel_op_s
 	 * Called with op->lock held.
 	 */
 	op->op_state |= OP_VFS_STATE_GIVEN_UP;
+	op->attempts++;
 	if (op_state_waiting(op)) {