Message ID | F9E001219150CB45BEDC82A650F360C9014674B9@G9W0717.americas.hpqcorp.net (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Adding: linasvepstas@gmail.com On Thu, Nov 8, 2012 at 9:28 PM, Pandarathil, Vijaymohan R <vijaymohan.pandarathil@hp.com> wrote: > When an error is detected on a PCIe device which does not have an > AER-aware driver, prevent AER infrastructure from reporting > successful error recovery. > > This is because the report_error_detected() function that gets > called in the first phase of recovery process allows forward > progress even when the driver for the device does not have AER > capabilities. It seems that all callbacks (in pci_error_handlers > structure) registered by drivers that gets called during error > recovery are not mandatory. So the intention of the infrastructure > design seems to be to allow forward progress even when a specific > callback has not been registered by a driver. However, if error > handler structure itself has not been registered, it doesn't make > sense to allow forward progress. > > As a result of the current design, in the case of a single device > having an AER-unaware driver or in the case of any function in a > multi-function card having an AER-unaware driver, a successful > recovery is reported. > > Typical scenario this happens is when a PCI device is detached > from a KVM host and the pci-stub driver on the host claims the > device. The pci-stub driver does not have error handling capabilities > but the AER infrastructure still reports that the device recovered > successfully. > > The changes proposed here leaves the device in an unrecovered state > if the driver for the device or for any function in a multi-function > card does not have error handler structure registered. This reflects > the true state of the device and prevents any partial recovery (or no > recovery at all) reported as successful. > > Please also see comments from Linas Vepstas at the following link > http://www.spinics.net/lists/linux-pci/msg18572.html > > Signed-off-by: Linas Vepstas <linasvepstas <at> gmail.com> > Signed-off-by: Myron Stowe <mstowe <at> redhat.com> > Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at> hp.com> Hay Vijay: I think Linas may have steered you incorrectly with his "FWIW, I guess I could add a Signed-off-by: Linas Vepstas <...>" comment. From ./Documentation/SubmittingPatches, "The Signed-off-by: tag indicates that the signer was involved in the development of the patch, or that he/she was in the patch's delivery path." I this case I suspect that "reviewed-by" or 'acked-by" would be more appropriate for both Linas (unless of course Linas worked with you in developing the patch) and myslef. Also, when posting to the list it's considered good etiquette to include (cc) everyone cited in the Signed-off, reviewed-by, acked-by, ..., lines (and possibly the linux-kernel list - again see ./Documentation/SubmittingPatches). Myron > > --- > > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c > index 06bad96..030b229 100644 > --- a/drivers/pci/pcie/aer/aerdrv_core.c > +++ b/drivers/pci/pcie/aer/aerdrv_core.c > @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev *dev, void *data) > > dev->error_state = result_data->state; > > + if ((!dev->driver || !dev->driver->err_handler) && > + !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) { > + dev_info(&dev->dev, "AER: Error detected but no driver has claimed this device or the driver is AER-unaware\n"); > + result_data->result = PCI_ERS_RESULT_NONE; > + return 1; > + } > if (!dev->driver || > !dev->driver->err_handler || > !dev->driver->err_handler->error_detected) { > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Yes, that is correct: Reviewed-by/Acked-by: me. I plead late-night wooziness. -- Linas On 9 November 2012 11:06, Myron Stowe <myron.stowe@gmail.com> wrote: > Adding: linasvepstas@gmail.com > > On Thu, Nov 8, 2012 at 9:28 PM, Pandarathil, Vijaymohan R > <vijaymohan.pandarathil@hp.com> wrote: >> When an error is detected on a PCIe device which does not have an >> AER-aware driver, prevent AER infrastructure from reporting >> successful error recovery. >> >> This is because the report_error_detected() function that gets >> called in the first phase of recovery process allows forward >> progress even when the driver for the device does not have AER >> capabilities. It seems that all callbacks (in pci_error_handlers >> structure) registered by drivers that gets called during error >> recovery are not mandatory. So the intention of the infrastructure >> design seems to be to allow forward progress even when a specific >> callback has not been registered by a driver. However, if error >> handler structure itself has not been registered, it doesn't make >> sense to allow forward progress. >> >> As a result of the current design, in the case of a single device >> having an AER-unaware driver or in the case of any function in a >> multi-function card having an AER-unaware driver, a successful >> recovery is reported. >> >> Typical scenario this happens is when a PCI device is detached >> from a KVM host and the pci-stub driver on the host claims the >> device. The pci-stub driver does not have error handling capabilities >> but the AER infrastructure still reports that the device recovered >> successfully. >> >> The changes proposed here leaves the device in an unrecovered state >> if the driver for the device or for any function in a multi-function >> card does not have error handler structure registered. This reflects >> the true state of the device and prevents any partial recovery (or no >> recovery at all) reported as successful. >> >> Please also see comments from Linas Vepstas at the following link >> http://www.spinics.net/lists/linux-pci/msg18572.html >> >> Signed-off-by: Linas Vepstas <linasvepstas <at> gmail.com> >> Signed-off-by: Myron Stowe <mstowe <at> redhat.com> >> Signed-off-by: Vijay Mohan Pandarathil <vijaymohan.pandarathil <at> hp.com> > > Hay Vijay: > > I think Linas may have steered you incorrectly with his "FWIW, I guess > I could add a > Signed-off-by: Linas Vepstas <...>" comment. > > From ./Documentation/SubmittingPatches, "The Signed-off-by: tag > indicates that the > signer was involved in the development of the patch, or that he/she > was in the patch's > delivery path." > > I this case I suspect that "reviewed-by" or 'acked-by" would be more > appropriate for > both Linas (unless of course Linas worked with you in developing the patch) and > myslef. > > Also, when posting to the list it's considered good etiquette to > include (cc) everyone > cited in the Signed-off, reviewed-by, acked-by, ..., lines (and > possibly the linux-kernel > list - again see ./Documentation/SubmittingPatches). > > Myron >> >> --- >> >> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c >> index 06bad96..030b229 100644 >> --- a/drivers/pci/pcie/aer/aerdrv_core.c >> +++ b/drivers/pci/pcie/aer/aerdrv_core.c >> @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev *dev, void *data) >> >> dev->error_state = result_data->state; >> >> + if ((!dev->driver || !dev->driver->err_handler) && >> + !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) { >> + dev_info(&dev->dev, "AER: Error detected but no driver has claimed this device or the driver is AER-unaware\n"); >> + result_data->result = PCI_ERS_RESULT_NONE; >> + return 1; >> + } >> if (!dev->driver || >> !dev->driver->err_handler || >> !dev->driver->err_handler->error_detected) { >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-pci" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
SGkgTXlyb24vTGluYXMsDQoNClRoYW5rcyBmb3IgcG9pbnRpbmcgaXQgb3V0LiBJIGFtIHN0aWxs IGxlYXJuaW5nIGFsbCB0aGUgY29udmVudGlvbnMuIEkgd2lsbCBzZW5kIG91dCBhIG5ld2VyIHZl cnNpb24gd2l0aCB0aGUgY2hhbmdlcy4NCg0KVGhhbmtzDQoNClZpamF5DQoNCj4gLS0tLS1Pcmln aW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogTGluYXMgVmVwc3RhcyBbbWFpbHRvOmxpbmFzdmVw c3Rhc0BnbWFpbC5jb21dDQo+IFNlbnQ6IEZyaWRheSwgTm92ZW1iZXIgMDksIDIwMTIgNzoyOCBQ TQ0KPiBUbzogTXlyb24gU3Rvd2UNCj4gQ2M6IFBhbmRhcmF0aGlsLCBWaWpheW1vaGFuIFI7IGxp bnV4LXBjaUB2Z2VyLmtlcm5lbC5vcmcNCj4gU3ViamVjdDogUmU6IFsgUEFUQ0ggXSBQQ0ktQUVS OiBEbyBub3QgcmVwb3J0IHN1Y2Nlc3NmdWwgZXJyb3IgcmVjb3ZlcnkgZm9yDQo+IGRldmljZXMg d2l0aCBBRVItdW5hd2FyZSBkcml2ZXJzDQo+IA0KPiBZZXMsIHRoYXQgaXMgY29ycmVjdDogUmV2 aWV3ZWQtYnkvQWNrZWQtYnk6IG1lLiBJIHBsZWFkIGxhdGUtbmlnaHQNCj4gd29vemluZXNzLg0K PiANCj4gLS0gTGluYXMNCj4gDQo+IE9uIDkgTm92ZW1iZXIgMjAxMiAxMTowNiwgTXlyb24gU3Rv d2UgPG15cm9uLnN0b3dlQGdtYWlsLmNvbT4gd3JvdGU6DQo+ID4gQWRkaW5nOiBsaW5hc3ZlcHN0 YXNAZ21haWwuY29tDQo+ID4NCj4gPiBPbiBUaHUsIE5vdiA4LCAyMDEyIGF0IDk6MjggUE0sIFBh bmRhcmF0aGlsLCBWaWpheW1vaGFuIFINCj4gPiA8dmlqYXltb2hhbi5wYW5kYXJhdGhpbEBocC5j b20+IHdyb3RlOg0KPiA+PiBXaGVuIGFuIGVycm9yIGlzIGRldGVjdGVkIG9uIGEgUENJZSBkZXZp Y2Ugd2hpY2ggZG9lcyBub3QgaGF2ZSBhbg0KPiA+PiBBRVItYXdhcmUgZHJpdmVyLCBwcmV2ZW50 IEFFUiBpbmZyYXN0cnVjdHVyZSBmcm9tIHJlcG9ydGluZw0KPiA+PiBzdWNjZXNzZnVsIGVycm9y IHJlY292ZXJ5Lg0KPiA+Pg0KPiA+PiBUaGlzIGlzIGJlY2F1c2UgdGhlIHJlcG9ydF9lcnJvcl9k ZXRlY3RlZCgpIGZ1bmN0aW9uIHRoYXQgZ2V0cw0KPiA+PiBjYWxsZWQgaW4gdGhlIGZpcnN0IHBo YXNlIG9mIHJlY292ZXJ5IHByb2Nlc3MgYWxsb3dzIGZvcndhcmQNCj4gPj4gcHJvZ3Jlc3MgZXZl biB3aGVuIHRoZSBkcml2ZXIgZm9yIHRoZSBkZXZpY2UgZG9lcyBub3QgaGF2ZSBBRVINCj4gPj4g Y2FwYWJpbGl0aWVzLiBJdCBzZWVtcyB0aGF0IGFsbCBjYWxsYmFja3MgKGluIHBjaV9lcnJvcl9o YW5kbGVycw0KPiA+PiBzdHJ1Y3R1cmUpIHJlZ2lzdGVyZWQgYnkgZHJpdmVycyB0aGF0IGdldHMg Y2FsbGVkIGR1cmluZyBlcnJvcg0KPiA+PiByZWNvdmVyeSBhcmUgbm90IG1hbmRhdG9yeS4gU28g dGhlIGludGVudGlvbiBvZiB0aGUgaW5mcmFzdHJ1Y3R1cmUNCj4gPj4gZGVzaWduIHNlZW1zIHRv IGJlIHRvIGFsbG93IGZvcndhcmQgcHJvZ3Jlc3MgZXZlbiB3aGVuIGEgc3BlY2lmaWMNCj4gPj4g Y2FsbGJhY2sgaGFzIG5vdCBiZWVuIHJlZ2lzdGVyZWQgYnkgYSBkcml2ZXIuIEhvd2V2ZXIsIGlm IGVycm9yDQo+ID4+IGhhbmRsZXIgc3RydWN0dXJlIGl0c2VsZiBoYXMgbm90IGJlZW4gcmVnaXN0 ZXJlZCwgaXQgZG9lc24ndCBtYWtlDQo+ID4+IHNlbnNlIHRvIGFsbG93IGZvcndhcmQgcHJvZ3Jl c3MuDQo+ID4+DQo+ID4+IEFzIGEgcmVzdWx0IG9mIHRoZSBjdXJyZW50IGRlc2lnbiwgaW4gdGhl IGNhc2Ugb2YgYSBzaW5nbGUgZGV2aWNlDQo+ID4+IGhhdmluZyBhbiBBRVItdW5hd2FyZSBkcml2 ZXIgb3IgaW4gdGhlIGNhc2Ugb2YgYW55IGZ1bmN0aW9uIGluIGENCj4gPj4gbXVsdGktZnVuY3Rp b24gY2FyZCBoYXZpbmcgYW4gQUVSLXVuYXdhcmUgZHJpdmVyLCBhIHN1Y2Nlc3NmdWwNCj4gPj4g cmVjb3ZlcnkgaXMgcmVwb3J0ZWQuDQo+ID4+DQo+ID4+IFR5cGljYWwgc2NlbmFyaW8gdGhpcyBo YXBwZW5zIGlzIHdoZW4gYSBQQ0kgZGV2aWNlIGlzIGRldGFjaGVkDQo+ID4+IGZyb20gYSBLVk0g aG9zdCBhbmQgdGhlIHBjaS1zdHViIGRyaXZlciBvbiB0aGUgaG9zdCBjbGFpbXMgdGhlDQo+ID4+ IGRldmljZS4gVGhlIHBjaS1zdHViIGRyaXZlciBkb2VzIG5vdCBoYXZlIGVycm9yIGhhbmRsaW5n IGNhcGFiaWxpdGllcw0KPiA+PiBidXQgdGhlIEFFUiBpbmZyYXN0cnVjdHVyZSBzdGlsbCByZXBv cnRzIHRoYXQgdGhlIGRldmljZSByZWNvdmVyZWQNCj4gPj4gc3VjY2Vzc2Z1bGx5Lg0KPiA+Pg0K PiA+PiBUaGUgY2hhbmdlcyBwcm9wb3NlZCBoZXJlIGxlYXZlcyB0aGUgZGV2aWNlIGluIGFuIHVu cmVjb3ZlcmVkIHN0YXRlDQo+ID4+IGlmIHRoZSBkcml2ZXIgZm9yIHRoZSBkZXZpY2Ugb3IgZm9y IGFueSBmdW5jdGlvbiBpbiBhIG11bHRpLWZ1bmN0aW9uDQo+ID4+IGNhcmQgZG9lcyBub3QgaGF2 ZSBlcnJvciBoYW5kbGVyIHN0cnVjdHVyZSByZWdpc3RlcmVkLiBUaGlzIHJlZmxlY3RzDQo+ID4+ IHRoZSB0cnVlIHN0YXRlIG9mIHRoZSBkZXZpY2UgYW5kIHByZXZlbnRzIGFueSBwYXJ0aWFsIHJl Y292ZXJ5IChvciBubw0KPiA+PiByZWNvdmVyeSBhdCBhbGwpIHJlcG9ydGVkIGFzIHN1Y2Nlc3Nm dWwuDQo+ID4+DQo+ID4+IFBsZWFzZSBhbHNvIHNlZSBjb21tZW50cyBmcm9tIExpbmFzIFZlcHN0 YXMgYXQgdGhlIGZvbGxvd2luZyBsaW5rDQo+ID4+IGh0dHA6Ly93d3cuc3Bpbmljcy5uZXQvbGlz dHMvbGludXgtcGNpL21zZzE4NTcyLmh0bWwNCj4gPj4NCj4gPj4gU2lnbmVkLW9mZi1ieTogTGlu YXMgVmVwc3RhcyA8bGluYXN2ZXBzdGFzIDxhdD4gZ21haWwuY29tPg0KPiA+PiBTaWduZWQtb2Zm LWJ5OiBNeXJvbiBTdG93ZSA8bXN0b3dlIDxhdD4gcmVkaGF0LmNvbT4NCj4gPj4gU2lnbmVkLW9m Zi1ieTogVmlqYXkgTW9oYW4gUGFuZGFyYXRoaWwgPHZpamF5bW9oYW4ucGFuZGFyYXRoaWwgPGF0 Pg0KPiBocC5jb20+DQo+ID4NCj4gPiBIYXkgVmlqYXk6DQo+ID4NCj4gPiBJIHRoaW5rIExpbmFz IG1heSBoYXZlIHN0ZWVyZWQgeW91IGluY29ycmVjdGx5IHdpdGggaGlzICJGV0lXLCBJIGd1ZXNz DQo+ID4gSSBjb3VsZCBhZGQgYQ0KPiA+IFNpZ25lZC1vZmYtYnk6IExpbmFzIFZlcHN0YXMgPC4u Lj4iIGNvbW1lbnQuDQo+ID4NCj4gPiBGcm9tIC4vRG9jdW1lbnRhdGlvbi9TdWJtaXR0aW5nUGF0 Y2hlcywgIlRoZSBTaWduZWQtb2ZmLWJ5OiB0YWcNCj4gPiBpbmRpY2F0ZXMgdGhhdCB0aGUNCj4g PiBzaWduZXIgd2FzIGludm9sdmVkIGluIHRoZSBkZXZlbG9wbWVudCBvZiB0aGUgcGF0Y2gsIG9y IHRoYXQgaGUvc2hlDQo+ID4gd2FzIGluIHRoZSBwYXRjaCdzDQo+ID4gZGVsaXZlcnkgcGF0aC4i DQo+ID4NCj4gPiBJIHRoaXMgY2FzZSBJIHN1c3BlY3QgdGhhdCAicmV2aWV3ZWQtYnkiIG9yICdh Y2tlZC1ieSIgd291bGQgYmUgbW9yZQ0KPiA+IGFwcHJvcHJpYXRlIGZvcg0KPiA+IGJvdGggTGlu YXMgKHVubGVzcyBvZiBjb3Vyc2UgTGluYXMgd29ya2VkIHdpdGggeW91IGluIGRldmVsb3Bpbmcg dGhlDQo+IHBhdGNoKSBhbmQNCj4gPiBteXNsZWYuDQo+ID4NCj4gPiBBbHNvLCB3aGVuIHBvc3Rp bmcgdG8gdGhlIGxpc3QgaXQncyBjb25zaWRlcmVkIGdvb2QgZXRpcXVldHRlIHRvDQo+ID4gaW5j bHVkZSAoY2MpIGV2ZXJ5b25lDQo+ID4gY2l0ZWQgaW4gdGhlIFNpZ25lZC1vZmYsIHJldmlld2Vk LWJ5LCBhY2tlZC1ieSwgLi4uLCBsaW5lcyAoYW5kDQo+ID4gcG9zc2libHkgdGhlIGxpbnV4LWtl cm5lbA0KPiA+IGxpc3QgLSBhZ2FpbiBzZWUgLi9Eb2N1bWVudGF0aW9uL1N1Ym1pdHRpbmdQYXRj aGVzKS4NCj4gPg0KPiA+IE15cm9uDQo+ID4+DQo+ID4+IC0tLQ0KPiA+Pg0KPiA+PiBkaWZmIC0t Z2l0IGEvZHJpdmVycy9wY2kvcGNpZS9hZXIvYWVyZHJ2X2NvcmUuYw0KPiBiL2RyaXZlcnMvcGNp L3BjaWUvYWVyL2FlcmRydl9jb3JlLmMNCj4gPj4gaW5kZXggMDZiYWQ5Ni4uMDMwYjIyOSAxMDA2 NDQNCj4gPj4gLS0tIGEvZHJpdmVycy9wY2kvcGNpZS9hZXIvYWVyZHJ2X2NvcmUuYw0KPiA+PiAr KysgYi9kcml2ZXJzL3BjaS9wY2llL2Flci9hZXJkcnZfY29yZS5jDQo+ID4+IEBAIC0yMTUsNiAr MjE1LDEyIEBAIHN0YXRpYyBpbnQgcmVwb3J0X2Vycm9yX2RldGVjdGVkKHN0cnVjdCBwY2lfZGV2 DQo+ICpkZXYsIHZvaWQgKmRhdGEpDQo+ID4+DQo+ID4+ICAgICAgICAgZGV2LT5lcnJvcl9zdGF0 ZSA9IHJlc3VsdF9kYXRhLT5zdGF0ZTsNCj4gPj4NCj4gPj4gKyAgICAgICBpZiAoKCFkZXYtPmRy aXZlciB8fCAhZGV2LT5kcml2ZXItPmVycl9oYW5kbGVyKSAmJg0KPiA+PiArICAgICAgICAgICAg ICAgIShkZXYtPmhkcl90eXBlICYgUENJX0hFQURFUl9UWVBFX0JSSURHRSkpIHsNCj4gPj4gKyAg ICAgICAgICAgICAgIGRldl9pbmZvKCZkZXYtPmRldiwgIkFFUjogRXJyb3IgZGV0ZWN0ZWQgYnV0 IG5vIGRyaXZlcg0KPiBoYXMgY2xhaW1lZCB0aGlzIGRldmljZSBvciB0aGUgZHJpdmVyIGlzIEFF Ui11bmF3YXJlXG4iKTsNCj4gPj4gKyAgICAgICAgICAgICAgIHJlc3VsdF9kYXRhLT5yZXN1bHQg PSBQQ0lfRVJTX1JFU1VMVF9OT05FOw0KPiA+PiArICAgICAgICAgICAgICAgcmV0dXJuIDE7DQo+ ID4+ICsgICAgICAgfQ0KPiA+PiAgICAgICAgIGlmICghZGV2LT5kcml2ZXIgfHwNCj4gPj4gICAg ICAgICAgICAgICAgICFkZXYtPmRyaXZlci0+ZXJyX2hhbmRsZXIgfHwNCj4gPj4gICAgICAgICAg ICAgICAgICFkZXYtPmRyaXZlci0+ZXJyX2hhbmRsZXItPmVycm9yX2RldGVjdGVkKSB7DQo+ID4+ IC0tDQo+ID4+IFRvIHVuc3Vic2NyaWJlIGZyb20gdGhpcyBsaXN0OiBzZW5kIHRoZSBsaW5lICJ1 bnN1YnNjcmliZSBsaW51eC1wY2kiIGluDQo+ID4+IHRoZSBib2R5IG9mIGEgbWVzc2FnZSB0byBt YWpvcmRvbW9Admdlci5rZXJuZWwub3JnDQo+ID4+IE1vcmUgbWFqb3Jkb21vIGluZm8gYXQgIGh0 dHA6Ly92Z2VyLmtlcm5lbC5vcmcvbWFqb3Jkb21vLWluZm8uaHRtbA0K -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c index 06bad96..030b229 100644 --- a/drivers/pci/pcie/aer/aerdrv_core.c +++ b/drivers/pci/pcie/aer/aerdrv_core.c @@ -215,6 +215,12 @@ static int report_error_detected(struct pci_dev *dev, void *data) dev->error_state = result_data->state; + if ((!dev->driver || !dev->driver->err_handler) && + !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) { + dev_info(&dev->dev, "AER: Error detected but no driver has claimed this device or the driver is AER-unaware\n"); + result_data->result = PCI_ERS_RESULT_NONE; + return 1; + } if (!dev->driver || !dev->driver->err_handler || !dev->driver->err_handler->error_detected) {