From patchwork Thu Apr 16 18:07:15 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Olga Kornievskaia <aglo@umich.edu>
X-Patchwork-Id: 6227351
Return-Path: <linux-nfs-owner@kernel.org>
X-Original-To: patchwork-linux-nfs@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id 21645BF4A6
	for <patchwork-linux-nfs@patchwork.kernel.org>;
	Thu, 16 Apr 2015 18:07:21 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 27469200E6
	for <patchwork-linux-nfs@patchwork.kernel.org>;
	Thu, 16 Apr 2015 18:07:20 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id EA61F200CC
	for <patchwork-linux-nfs@patchwork.kernel.org>;
	Thu, 16 Apr 2015 18:07:18 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752868AbbDPSHR (ORCPT
	<rfc822;patchwork-linux-nfs@patchwork.kernel.org>);
	Thu, 16 Apr 2015 14:07:17 -0400
Received: from mail-ig0-f182.google.com ([209.85.213.182]:34923 "EHLO
	mail-ig0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752756AbbDPSHQ convert rfc822-to-8bit (ORCPT
	<rfc822; linux-nfs@vger.kernel.org>); Thu, 16 Apr 2015 14:07:16 -0400
Received: by igbyr2 with SMTP id yr2so84935401igb.0
	for <linux-nfs@vger.kernel.org>; Thu, 16 Apr 2015 11:07:16 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gmail.com; s=20120113;
	h=mime-version:sender:in-reply-to:references:date:message-id:subject
	:from:to:cc:content-type:content-transfer-encoding;
	bh=lpYqjw1Nts/i+iv29GkkAx3z5Iw60r5XdeODOIscROA=;
	b=ZbKBukTrFvJcPOmx7BLKajrIVZuOb4kChBLVYL2mOstdfy6YROu0D4OW8FDNvXjTFw
	gbL1KE74RceRUUOum8epfcwN6R+a4mrR9QpcLLuoGEuJBNPpRim1YNstXwrGQJsUWDPB
	wircKuNDgyd+2+2eLO4RME8kt0suC7Gqf7JvlVps5vmJ0LD6p5dunlq81AkYfIrYAla+
	D+AgOOWqW9LdZomX4HwwFLAv9ySAoEXNwXhE/k6zFRiI+SaMIW0yeio0mWIEkZYFsxDV
	KuFnFxcPA6VENyyZW9t4fSbe9koaDsaXndIKwowCo5JYeIm2Kfw5AEXMlcOsp9DzcsrF
	3Hlw==
MIME-Version: 1.0
X-Received: by 10.107.13.11 with SMTP id 11mr24583737ion.70.1429207635981;
	Thu, 16 Apr 2015 11:07:15 -0700 (PDT)
Received: by 10.107.131.214 with HTTP; Thu, 16 Apr 2015 11:07:15 -0700 (PDT)
In-Reply-To: 
 <CAN-5tyEx_Pp0q58EiYSLLAZZ1cR4Tn9=gkD7b_pubsZdXymsFg@mail.gmail.com>
References: 
 <CAN-5tyE60A+2h6My0Ebb=5j4DbVifDBuG6CjymCRsRefNTMd=w@mail.gmail.com>
	<CAHQdGtSTc-34R_C0TwYmwk8KdQFadxQbhgmoTxzhWKhsYjiFww@mail.gmail.com>
	<CAN-5tyEx_Pp0q58EiYSLLAZZ1cR4Tn9=gkD7b_pubsZdXymsFg@mail.gmail.com>
Date: Thu, 16 Apr 2015 14:07:15 -0400
X-Google-Sender-Auth: 5APVo2BqdZkTE8dWjE8QEcXECo0
Message-ID: 
 <CAN-5tyGSEURn62cawEbudhQ5gsEdRD+HB7qKjYRS-3L-OLsPrQ@mail.gmail.com>
Subject: Re: is receiving BAD_STATEID during IO on delegated stateid
	unrecoverable?
From: Olga Kornievskaia <aglo@umich.edu>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: linux-nfs <linux-nfs@vger.kernel.org>
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-nfs.vger.kernel.org>
X-Mailing-List: linux-nfs@vger.kernel.org
X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_HI,T_DKIM_INVALID,T_RP_MATCHES_RCVD,UNPARSEABLE_RELAY
	autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Which of the two solutions would you prefer as fix?

Problem statement: SETATTR is sent with a delegation stateid after the
original open was closed so we have no open state. When the server
fails the SETATTR with stateid-type of error BAD_STATEID or
ADMIN_REVOKED, client fails to recover because it has no open state to
initiate recovery and instead fails with EIO.

Solution #1: When we need to send a SETATTR and we have no state, use
zero stateid instead. Disadvantage of this approach is that if the
delegation state is actually valid, then it'll force a delegation to
be recalled by the server.

On Wed, Apr 15, 2015 at 2:27 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> Hi Trond,
>
> I'm resurrecting an old client received "BAD_STATEID" using delegation
> stateid on some operation thread.... If client used a delegation
> stateid on a SETATTR (i.e. for open truncate) and received this error,
> does this also lead to unrecoverable state or do you think client
> should try recover? I can see the same argument that if state was
> revoked another client could have change the file already.
>
> If you think it's recoverable, there is a bug in the client because it
> doesn't recover. I can explain why but there is no need if this is an
> acceptable behavior.
>
> Thanks.
>
> On Thu, Nov 20, 2014 at 4:14 PM, Trond Myklebust
> <trond.myklebust@primarydata.com> wrote:
>> On Thu, Nov 20, 2014 at 12:57 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> Hi folks,
>>>
>>> As far as I can tell, receiving a BAD_STATEID on an IO operation on a
>>> delegated stateid when there was a local lock acquired for this IO is
>>> unrecoverable — leads to EIO. Codewise, stateid recovery sees that it
>>> has a local lock and marks it lost and retry of the IO operation
>>> returns the EIO.
>>>
>>> Is the reason for seizing the IO is that if the server for some reason
>>> revoked this stateid then there is no guarantee about the locks and
>>> thus doing any IO.
>>>
>>> This also applies to both 4.0 and 4.1 code as far as I can tell.
>>>
>>> Can somebody confirm or tell me if this is wrong?
>>>
>>
>> Yes. If the server has lost the lock, then the application has lost
>> atomicity for the set of operations that were supposed to be protected
>> by that lock, and this is why we return the EIO. In older kernels we
>> did try to recover the lock, but that can lead to application-visible
>> corruption of data, and so we no longer do that unless you explicitly
>> set the nfs 'recover_lost_locks' module parameter - see
>> Documentation/kernel-parameters.txt.
>>
>> --
>> Trond Myklebust
>>
>> Linux NFS client maintainer, PrimaryData
>>
>> trond.myklebust@primarydata.com
---
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ad7cf7e..4dda0f1 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2485,7 +2485,8 @@ static int _nfs4_do_setattr(struct inode *inode, struct rp
        truncate = (sattr->ia_valid & ATTR_SIZE) ? true : false;
        fmode = truncate ? FMODE_WRITE : FMODE_READ;

-       if (nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
+       if (state != NULL &&
+               nfs4_copy_delegation_stateid(&arg.stateid, inode, fmode)) {
                /* Use that stateid */
        } else if (truncate && state != NULL) {

Solution #2: If we get a stateid-like error on a SETATTR and we have
no state, return/remove bad delegation and retry the call again which
will have the code pick the zero stateid.

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ad7cf7e..be16143 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2535,6 +2535,14 @@ static int nfs4_do_setattr(struct inode *inode, struct rp
                                        err = -EACCES;
                                goto out;
                        }
+               case -NFS4ERR_DELEG_REVOKED:
+               case -NFS4ERR_ADMIN_REVOKED:
+               case -NFS4ERR_BAD_STATEID:
+                       if (state == NULL) {
+                               nfs4_inode_return_delegation(inode);
+                               exception.retry = 1;
+                               continue;
+                       }
                }
                err = nfs4_handle_exception(server, err, &exception);
        } while (exception.retry);