Message ID | 20170310213715.38258-1-kolga@netapp.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
T24gRnJpLCAyMDE3LTAzLTEwIGF0IDE2OjM3IC0wNTAwLCBPbGdhIEtvcm5pZXZza2FpYSB3cm90 ZToNCj4gSWYgd2Ugc2VudCBhbiBvcGVyYXRpb24gdG8gdGhlICJEUyIgYW5kIGdvdCBhbiBlcnJv ciwgdGhlIGNvZGUNCj4gcmVzZW5kcw0KPiB0byAiTURTIiBidXQgd2hlbiB0aGV5IGFyZSB0aGUg c2FtZSwgaXQgZ2V0cyB0aGUgc2FtZSBlcnJvciBhbmQgZ29lcw0KPiBpbnRvIHRoZSBpbmZpbml0 ZSBsb29wLiBFeGFtcGxlIHdhcyBDT01NSVQgZ2V0dGluZyBFQUNDRVMuDQo+IA0KDQpUaGUgY29y cmVjdCBiZWhhdmlvdXIgd2hlbiBnZXR0aW5nIEVBQ0NFUyBmcm9tIGEgQ09NTUlUIHRvIHRoZSBN RFMNCndvdWxkIGJlIHRvIHJldHJ5IHRoZSBlbnRpcmUgc2VyaWVzIG9mIFdSSVRFIGNhbGxzIHdp dGggc3RhYmxlIHdyaXRlcy4NCklmIHRoZSBzZXJ2ZXIgdGhlbiByZXR1cm5zIHdpdGggYW55dGhp bmcgb3RoZXIgdGhhbiBGSUxFX1NZTkMsIHRoZW4NCkVJTy4NCg0KV2h5IGlzIHRoZSBzZXJ2ZXIg ZmFpbGluZyB0aGUgQ09NTUlUIGhlcmUgaWYgdGhlIGNsaWVudCB0aGlua3MgaXQgc2VudA0KdW5z dGFibGUgd3JpdGVzPw0KDQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNsaWVudCBt YWludGFpbmVyLCBQcmltYXJ5RGF0YQ0KdHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbQ0K -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Mar 11, 2017, at 10:46 AM, Trond Myklebust <trondmy@primarydata.com> wrote: > > On Fri, 2017-03-10 at 16:37 -0500, Olga Kornievskaia wrote: >> If we sent an operation to the "DS" and got an error, the code >> resends >> to "MDS" but when they are the same, it gets the same error and goes >> into the infinite loop. Example was COMMIT getting EACCES. >> > > The correct behaviour when getting EACCES from a COMMIT to the MDS > would be to retry the entire series of WRITE calls with stable writes. > If the server then returns with anything other than FILE_SYNC, then > EIO. > > Why is the server failing the COMMIT here if the client thinks it sent > unstable writes? This looping was discovered during connectathon testing against the desy server. While I believe Tigran thinks there is an error in his code and EACCES from COMMIT was unexpected, we thought that it would be best if the client didn’t go into infinite loop in this condition given that EACCES is a valid error for COMMIT. If you think this shouldn’t happen in real life, then I’m ok with not pursuing this. On a different but related note, in case when MDS != DS, when the writes are retried they are not retried against the MDS but again are sent to the DS, is that a bug? > > -- > Trond Myklebust > Linux NFS client maintainer, PrimaryData > trond.myklebust@primarydata.com -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
T24gU3VuLCAyMDE3LTAzLTEyIGF0IDE0OjQwIC0wNDAwLCBPbGdhIEtvcm5pZXZza2FpYSB3cm90 ZToNCj4gPiBPbiBNYXIgMTEsIDIwMTcsIGF0IDEwOjQ2IEFNLCBUcm9uZCBNeWtsZWJ1c3QgPHRy b25kbXlAcHJpbWFyeWRhdGEuDQo+ID4gY29tPiB3cm90ZToNCj4gPiANCj4gPiBPbiBGcmksIDIw MTctMDMtMTAgYXQgMTY6MzcgLTA1MDAsIE9sZ2EgS29ybmlldnNrYWlhIHdyb3RlOg0KPiA+ID4g SWYgd2Ugc2VudCBhbiBvcGVyYXRpb24gdG8gdGhlICJEUyIgYW5kIGdvdCBhbiBlcnJvciwgdGhl IGNvZGUNCj4gPiA+IHJlc2VuZHMNCj4gPiA+IHRvICJNRFMiIGJ1dCB3aGVuIHRoZXkgYXJlIHRo ZSBzYW1lLCBpdCBnZXRzIHRoZSBzYW1lIGVycm9yIGFuZA0KPiA+ID4gZ29lcw0KPiA+ID4gaW50 byB0aGUgaW5maW5pdGUgbG9vcC4gRXhhbXBsZSB3YXMgQ09NTUlUIGdldHRpbmcgRUFDQ0VTLg0K PiA+ID4gDQo+ID4gDQo+ID4gVGhlIGNvcnJlY3QgYmVoYXZpb3VyIHdoZW4gZ2V0dGluZyBFQUND RVMgZnJvbSBhIENPTU1JVCB0byB0aGUgTURTDQo+ID4gd291bGQgYmUgdG8gcmV0cnkgdGhlIGVu dGlyZSBzZXJpZXMgb2YgV1JJVEUgY2FsbHMgd2l0aCBzdGFibGUNCj4gPiB3cml0ZXMuDQo+ID4g SWYgdGhlIHNlcnZlciB0aGVuIHJldHVybnMgd2l0aCBhbnl0aGluZyBvdGhlciB0aGFuIEZJTEVf U1lOQywgdGhlbg0KPiA+IEVJTy4NCj4gPiANCj4gPiBXaHkgaXMgdGhlIHNlcnZlciBmYWlsaW5n IHRoZSBDT01NSVQgaGVyZSBpZiB0aGUgY2xpZW50IHRoaW5rcyBpdA0KPiA+IHNlbnQNCj4gPiB1 bnN0YWJsZSB3cml0ZXM/DQo+IA0KPiBUaGlzIGxvb3Bpbmcgd2FzIGRpc2NvdmVyZWQgZHVyaW5n IGNvbm5lY3RhdGhvbiB0ZXN0aW5nIGFnYWluc3QgdGhlDQo+IGRlc3kgc2VydmVyLiBXaGlsZSBJ IGJlbGlldmUgVGlncmFuIHRoaW5rcyB0aGVyZSBpcyBhbiBlcnJvciBpbiBoaXMNCj4gY29kZSBh bmQgRUFDQ0VTIGZyb20gQ09NTUlUIHdhcyB1bmV4cGVjdGVkLCB3ZSB0aG91Z2h0IHRoYXQgaXQg d291bGQNCj4gYmUgYmVzdCBpZiB0aGUgY2xpZW50IGRpZG7igJl0IGdvIGludG8gaW5maW5pdGUg bG9vcCBpbiB0aGlzIGNvbmRpdGlvbg0KPiBnaXZlbiB0aGF0IEVBQ0NFUyBpcyBhIHZhbGlkIGVy cm9yIGZvciBDT01NSVQuIElmIHlvdSB0aGluayB0aGlzDQo+IHNob3VsZG7igJl0IGhhcHBlbiBp biByZWFsIGxpZmUsIHRoZW4gSeKAmW0gb2sgd2l0aCBub3QgcHVyc3VpbmcgdGhpcy7CoA0KDQpJ dCBpcyBsZWdhbCwgYW5kIEkgY2FuIHNvcnQgb2Ygc2VlIGhvdyBpdCBtaWdodCBiZSBpbnZva2Vk IGlmIGEgU0VUQVRUUg0KY2hhbmdlcyB0aGUgbW9kZSBiZXR3ZWVuIHRoZSBXUklURSBhbmQgdGhl IENPTU1JVCwgYnV0IGl0IGlzIGEgY29ybmVyDQpjYXNlLg0KDQo+IE9uIGEgZGlmZmVyZW50IGJ1 dCByZWxhdGVkIG5vdGUsIGluIGNhc2Ugd2hlbiBNRFMgIT0gRFMsIHdoZW4gdGhlDQo+IHdyaXRl cyBhcmUgcmV0cmllZCB0aGV5IGFyZSBub3QgcmV0cmllZCBhZ2FpbnN0IHRoZSBNRFMgYnV0IGFn YWluIGFyZQ0KPiBzZW50IHRvIHRoZSBEUywgaXMgdGhhdCBhIGJ1Zz8NCg0KSSB3b3VsZCBleHBl Y3QgdGhhdCBvbmNlIHlvdSBkZWNpZGUgdG8gZmFsbCBiYWNrIHRvIGRvaW5nIEkvTyB0aHJvdWdo DQp0aGUgTURTLCB0aGVuIHlvdSdsbCB3YW50IGFueSByZXNlbmRzIHRvIGFsc28gZ28gdG8gdGhl IE1EUy4NCg0KPiA+IA0KPiA+IC0twqANCj4gPiBUcm9uZCBNeWtsZWJ1c3QNCj4gPiBMaW51eCBO RlMgY2xpZW50IG1haW50YWluZXIsIFByaW1hcnlEYXRhDQo+ID4gdHJvbmQubXlrbGVidXN0QHBy aW1hcnlkYXRhLmNvbQ0KPiANCj4gDQotLSANClRyb25kIE15a2xlYnVzdA0KTGludXggTkZTIGNs aWVudCBtYWludGFpbmVyLCBQcmltYXJ5RGF0YQ0KdHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRh LmNvbQ0K -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Mar 12, 2017 at 5:00 PM, Trond Myklebust <trondmy@primarydata.com> wrote: > On Sun, 2017-03-12 at 14:40 -0400, Olga Kornievskaia wrote: >> > On Mar 11, 2017, at 10:46 AM, Trond Myklebust <trondmy@primarydata. >> > com> wrote: >> > >> > On Fri, 2017-03-10 at 16:37 -0500, Olga Kornievskaia wrote: >> > > If we sent an operation to the "DS" and got an error, the code >> > > resends >> > > to "MDS" but when they are the same, it gets the same error and >> > > goes >> > > into the infinite loop. Example was COMMIT getting EACCES. >> > > >> > >> > The correct behaviour when getting EACCES from a COMMIT to the MDS >> > would be to retry the entire series of WRITE calls with stable >> > writes. >> > If the server then returns with anything other than FILE_SYNC, then >> > EIO. >> > Going back to this comment. Then in non-pnfs (4.1 or 4.0), is the expected behavior is to re-send the writes with FILE_SYNC. This is not what the code does now. >> > Why is the server failing the COMMIT here if the client thinks it >> > sent >> > unstable writes? >> >> This looping was discovered during connectathon testing against the >> desy server. While I believe Tigran thinks there is an error in his >> code and EACCES from COMMIT was unexpected, we thought that it would >> be best if the client didn’t go into infinite loop in this condition >> given that EACCES is a valid error for COMMIT. If you think this >> shouldn’t happen in real life, then I’m ok with not pursuing this. > > It is legal, and I can sort of see how it might be invoked if a SETATTR > changes the mode between the WRITE and the COMMIT, but it is a corner > case. > >> On a different but related note, in case when MDS != DS, when the >> writes are retried they are not retried against the MDS but again are >> sent to the DS, is that a bug? > > I would expect that once you decide to fall back to doing I/O through > the MDS, then you'll want any resends to also go to the MDS. I guess I'm not seeing there is a fallback happening if COMMIT fails. I think if writes fail then fallback happens but not the COMMIT. Again I'm not sure if this is worth pursing as you say it's a corner case. >> > >> > -- >> > Trond Myklebust >> > Linux NFS client maintainer, PrimaryData >> > trond.myklebust@primarydata.com >> >> > -- > Trond Myklebust > Linux NFS client maintainer, PrimaryData > trond.myklebust@primarydata.com -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Mar 13, 2017 at 10:42 AM, Olga Kornievskaia <aglo@umich.edu> wrote: > On Sun, Mar 12, 2017 at 5:00 PM, Trond Myklebust > <trondmy@primarydata.com> wrote: >> On Sun, 2017-03-12 at 14:40 -0400, Olga Kornievskaia wrote: >>> > On Mar 11, 2017, at 10:46 AM, Trond Myklebust <trondmy@primarydata. >>> > com> wrote: >>> > >>> > On Fri, 2017-03-10 at 16:37 -0500, Olga Kornievskaia wrote: >>> > > If we sent an operation to the "DS" and got an error, the code >>> > > resends >>> > > to "MDS" but when they are the same, it gets the same error and >>> > > goes >>> > > into the infinite loop. Example was COMMIT getting EACCES. >>> > > >>> > >>> > The correct behaviour when getting EACCES from a COMMIT to the MDS >>> > would be to retry the entire series of WRITE calls with stable >>> > writes. >>> > If the server then returns with anything other than FILE_SYNC, then >>> > EIO. >>> > > > Going back to this comment. Then in non-pnfs (4.1 or 4.0), is the > expected behavior is to re-send the writes with FILE_SYNC. This is not > what the code does now. > >>> > Why is the server failing the COMMIT here if the client thinks it >>> > sent >>> > unstable writes? >>> >>> This looping was discovered during connectathon testing against the >>> desy server. While I believe Tigran thinks there is an error in his >>> code and EACCES from COMMIT was unexpected, we thought that it would >>> be best if the client didn’t go into infinite loop in this condition >>> given that EACCES is a valid error for COMMIT. If you think this >>> shouldn’t happen in real life, then I’m ok with not pursuing this. >> >> It is legal, and I can sort of see how it might be invoked if a SETATTR >> changes the mode between the WRITE and the COMMIT, but it is a corner >> case. >> >>> On a different but related note, in case when MDS != DS, when the >>> writes are retried they are not retried against the MDS but again are >>> sent to the DS, is that a bug? >> >> I would expect that once you decide to fall back to doing I/O through >> the MDS, then you'll want any resends to also go to the MDS. > > I guess I'm not seeing there is a fallback happening if COMMIT fails. > I think if writes fail then fallback happens but not the COMMIT. > > Again I'm not sure if this is worth pursing as you say it's a corner case. More thoughts about your comment about retrying the WRITEs. If this is indeed a legitimate case of SETATTR changing mode, then WRITEs that are retried will fail with EACCES as well. So why bother retrying? Might as well just return the error to the application. That aside, I still think my patch is needed as it fixes the infinite loop that will exist in the MDS=DS pnfs case. Andy suggested that instead of checking for the same rpc client (which might fail if there is trunking going on in the future), instead to check the flags from the exchange_id from the server. > >>> > >>> > -- >>> > Trond Myklebust >>> > Linux NFS client maintainer, PrimaryData >>> > trond.myklebust@primarydata.com >>> >>> >> -- >> Trond Myklebust >> Linux NFS client maintainer, PrimaryData >> trond.myklebust@primarydata.com -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/nfs/filelayout/filelayout.c b/fs/nfs/filelayout/filelayout.c index 44347f4..772be38 100644 --- a/fs/nfs/filelayout/filelayout.c +++ b/fs/nfs/filelayout/filelayout.c @@ -207,6 +207,8 @@ static int filelayout_async_handle_error(struct rpc_task *task, /* fall through */ default: reset: + if (mds_client->cl_rpcclient == clp->cl_rpcclient) + return task->tk_status; dprintk("%s Retry through MDS. Error %d\n", __func__, task->tk_status); return -NFS4ERR_RESET_TO_MDS; @@ -384,9 +386,10 @@ static int filelayout_commit_done_cb(struct rpc_task *task, return -EAGAIN; } - pnfs_set_layoutcommit(data->inode, data->lseg, data->lwb); + if (!err) + pnfs_set_layoutcommit(data->inode, data->lseg, data->lwb); - return 0; + return err; } static void filelayout_write_prepare(struct rpc_task *task, void *data)
If we sent an operation to the "DS" and got an error, the code resends to "MDS" but when they are the same, it gets the same error and goes into the infinite loop. Example was COMMIT getting EACCES. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> --- fs/nfs/filelayout/filelayout.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)