[v4,0/3] support NVMe smart critical warning injection

Message ID 20210115032702.466631-1-pizhenwei@bytedance.com (mailing list archive)

Message

zhenwei pi Jan. 15, 2021, 3:26 a.m. UTC
v3 -> v4:
- Drop "Fix overwritten bar.cap". (Already fixed)

- Avoid enqueuing duplicate events.

- Several minor changes for coding style & function/variable name.

v2 -> v3:
- Introduce "Persistent Memory Region has become read-only or
  unreliable"

- Fix overwritten bar.cap

- Check smart critical warning value from QOM.

- Trigger asynchronous event during smart warning injection.

v1 -> v2:
- Suggested by Philippe & Klaus, set/get smart_critical_warning by QMP.

v1:
- Add smart_critical_warning for nvme device which can be set by QEMU
  command line to emulate hardware error.

Zhenwei Pi (3):
  block/nvme: introduce bit 5 for critical warning
  hw/block/nvme: add smart_critical_warning property
  hw/block/nvme: trigger async event during injecting smart warning

 hw/block/nvme.c      | 91 +++++++++++++++++++++++++++++++++++++++-----
 hw/block/nvme.h      |  1 +
 include/block/nvme.h |  3 ++
 3 files changed, 86 insertions(+), 9 deletions(-)
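
For readers unfamiliar with the field being injected, here is a minimal
sketch of the critical warning byte as the changelog above describes it:
bit 5 is the "Persistent Memory Region has become read-only or
unreliable" warning, the value is checked before it is accepted, and
only bits that are newly set would trigger an asynchronous event. The
CW_* names and the two helpers are hypothetical and are not taken from
the patches; the bit meanings follow the SMART / Health Information log
page in the NVMe specification.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Critical Warning bits of the SMART / Health Information log page.
 * The CW_* names are illustrative only, not the identifiers used in
 * the series.
 */
enum {
    CW_SPARE           = 1 << 0, /* available spare below threshold */
    CW_TEMPERATURE     = 1 << 1, /* temperature outside thresholds */
    CW_RELIABILITY     = 1 << 2, /* NVM subsystem reliability degraded */
    CW_MEDIA_READ_ONLY = 1 << 3, /* media placed in read-only mode */
    CW_VOLATILE_BACKUP = 1 << 4, /* volatile memory backup failed */
    CW_PMR_UNRELIABLE  = 1 << 5, /* PMR read-only or unreliable */
    CW_VALID_MASK      = 0x3f,   /* bits 6 and 7 are reserved */
};

/* Reject values that set any reserved bit. */
static bool cw_is_valid(uint8_t value)
{
    return (value & ~CW_VALID_MASK) == 0;
}

/*
 * Bits set in new_value that were clear in old_value. Assumption: only
 * these warrant a new asynchronous event, so re-injecting an unchanged
 * warning stays silent.
 */
static uint8_t cw_newly_set(uint8_t old_value, uint8_t new_value)
{
    return (uint8_t)(new_value & ~old_value);
}
```

For example, going from 0x01 to 0x21 reports only bit 5 as new, while
writing 0x21 again reports nothing.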

Comments

Klaus Jensen Jan. 18, 2021, 9:34 a.m. UTC | #1
On Jan 15 11:26, zhenwei pi wrote:
> v3 -> v4:
> - Drop "Fix overwritten bar.cap". (Already fixed)
> 
> - Avoid to enqueue the duplicate event.
> 
> - Several minor changes for coding style & function/variable name.
> 
> v2 -> v3:
> - Introduce "Persistent Memory Region has become read-only or
>   unreliable"
> 
> - Fix overwritten bar.cap
> 
> - Check smart critical warning value from QOM.
> 
> - Trigger asynchronous event during smart warning injection.
> 
> v1 -> v2:
> - Suggested by Philippe & Klaus, set/get smart_critical_warning by QMP.
> 
> v1:
> - Add smart_critical_warning for nvme device which can be set by QEMU
>   command line to emulate hardware error.
> 
> Zhenwei Pi (3):
>   block/nvme: introduce bit 5 for critical warning
>   hw/block/nvme: add smart_critical_warning property
>   hw/blocl/nvme: trigger async event during injecting smart warning
> 
>  hw/block/nvme.c      | 91 +++++++++++++++++++++++++++++++++++++++-----
>  hw/block/nvme.h      |  1 +
>  include/block/nvme.h |  3 ++
>  3 files changed, 86 insertions(+), 9 deletions(-)
> 

This looks pretty good to me.

I think maybe we want to handle the duplicate event stuff more generally
from the AER/AEN code, but this does the job.

Tested-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>

zhenwei pi Jan. 19, 2021, 2:05 a.m. UTC | #2
On 1/18/21 5:34 PM, Klaus Jensen wrote:
> On Jan 15 11:26, zhenwei pi wrote:
>> v3 -> v4:
>> - Drop "Fix overwritten bar.cap". (Already fixed)
>>
>> - Avoid to enqueue the duplicate event.
>>
>> - Several minor changes for coding style & function/variable name.
>>
>> v2 -> v3:
>> - Introduce "Persistent Memory Region has become read-only or
>>    unreliable"
>>
>> - Fix overwritten bar.cap
>>
>> - Check smart critical warning value from QOM.
>>
>> - Trigger asynchronous event during smart warning injection.
>>
>> v1 -> v2:
>> - Suggested by Philippe & Klaus, set/get smart_critical_warning by QMP.
>>
>> v1:
>> - Add smart_critical_warning for nvme device which can be set by QEMU
>>    command line to emulate hardware error.
>>
>> Zhenwei Pi (3):
>>    block/nvme: introduce bit 5 for critical warning
>>    hw/block/nvme: add smart_critical_warning property
>>    hw/blocl/nvme: trigger async event during injecting smart warning
>>
>>   hw/block/nvme.c      | 91 +++++++++++++++++++++++++++++++++++++++-----
>>   hw/block/nvme.h      |  1 +
>>   include/block/nvme.h |  3 ++
>>   3 files changed, 86 insertions(+), 9 deletions(-)
>>
> 
> This looks pretty good to me.
> 
> I think maybe we want to handle the duplicate event stuff more generally
> from the AER/AEN code, but this does the job.
> 
> Tested-by: Klaus Jensen <k.jensen@samsung.com>
> Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
> 

What's the next step I should take? Should I send a new version that
implements this? As I understand it, before inserting a new event into
aer_queue, I can scan all pending AERs for an identical event.

nvme_enqueue_event()
{
     ...

     /* Skip the insert if an identical event is already pending. */
     QTAILQ_FOREACH(pending, &n->aer_queue, entry) {
         if (pending->result.event_type == event_type &&
             pending->result.event_info == event_info &&
             pending->result.log_page == log_page) {
             return;
         }
     }

     QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
     n->aer_queued++;
     ...
}
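
The same idea as a self-contained sketch that compiles outside QEMU. It
assumes the TAILQ macros from <sys/queue.h> in place of QEMU's QTAILQ
ones, and struct ctrl, struct aer_event, ctrl_init() and
enqueue_event() are hypothetical stand-ins rather than the real
controller types. The scan uses its own iterator so the pointer to the
new event is never overwritten by the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for one pending asynchronous event. */
struct aer_event {
    uint8_t event_type;
    uint8_t event_info;
    uint8_t log_page;
    TAILQ_ENTRY(aer_event) entry;
};

TAILQ_HEAD(aer_queue, aer_event);

/* Hypothetical stand-in for the controller's event bookkeeping. */
struct ctrl {
    struct aer_queue aer_queue;
    unsigned aer_queued;
};

static void ctrl_init(struct ctrl *n)
{
    TAILQ_INIT(&n->aer_queue);
    n->aer_queued = 0;
}

/*
 * Queue an event unless an identical one is already pending.
 * Returns true if the event was queued, false if it was a duplicate
 * or the allocation failed.
 */
static bool enqueue_event(struct ctrl *n, uint8_t event_type,
                          uint8_t event_info, uint8_t log_page)
{
    struct aer_event *pending, *event;

    assert(n != NULL);

    /* Scan the pending events for an exact match on all three fields. */
    TAILQ_FOREACH(pending, &n->aer_queue, entry) {
        if (pending->event_type == event_type &&
            pending->event_info == event_info &&
            pending->log_page == log_page) {
            return false;
        }
    }

    event = calloc(1, sizeof(*event));
    if (!event) {
        return false;
    }
    event->event_type = event_type;
    event->event_info = event_info;
    event->log_page = log_page;

    TAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
    n->aer_queued++;
    return true;
}
```

The scan is linear in the number of pending events, which should be
acceptable as long as the queue stays short.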

Klaus Jensen Jan. 19, 2021, 5:01 a.m. UTC | #3
On Jan 19 10:05, zhenwei pi wrote:
> On 1/18/21 5:34 PM, Klaus Jensen wrote:
> > On Jan 15 11:26, zhenwei pi wrote:
> > > v3 -> v4:
> > > - Drop "Fix overwritten bar.cap". (Already fixed)
> > > 
> > > - Avoid to enqueue the duplicate event.
> > > 
> > > - Several minor changes for coding style & function/variable name.
> > > 
> > > v2 -> v3:
> > > - Introduce "Persistent Memory Region has become read-only or
> > >    unreliable"
> > > 
> > > - Fix overwritten bar.cap
> > > 
> > > - Check smart critical warning value from QOM.
> > > 
> > > - Trigger asynchronous event during smart warning injection.
> > > 
> > > v1 -> v2:
> > > - Suggested by Philippe & Klaus, set/get smart_critical_warning by QMP.
> > > 
> > > v1:
> > > - Add smart_critical_warning for nvme device which can be set by QEMU
> > >    command line to emulate hardware error.
> > > 
> > > Zhenwei Pi (3):
> > >    block/nvme: introduce bit 5 for critical warning
> > >    hw/block/nvme: add smart_critical_warning property
> > >    hw/blocl/nvme: trigger async event during injecting smart warning
> > > 
> > >   hw/block/nvme.c      | 91 +++++++++++++++++++++++++++++++++++++++-----
> > >   hw/block/nvme.h      |  1 +
> > >   include/block/nvme.h |  3 ++
> > >   3 files changed, 86 insertions(+), 9 deletions(-)
> > > 
> > 
> > This looks pretty good to me.
> > 
> > I think maybe we want to handle the duplicate event stuff more generally
> > from the AER/AEN code, but this does the job.
> > 
> > Tested-by: Klaus Jensen <k.jensen@samsung.com>
> > Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
> > 
> 
> What's the next step I should take? Should I push a new version to implement
> this purpose? From my understanding, before inserting a new event to
> aer_queue, I can parse all the pending aer to find the same event.
> 
> nvme_enqueue_event()
> {
>     ...
> 
>     QTAILQ_FOREACH_SAFE(event, &n->aer_queue, entry, next) {
>         if ((event->result.event_type == event_type)
>             && (event->result.event_info == event_info)
>             && (event->result.log_page == log_page))
>             return;
>     }
> 
>     QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
> 
> 
> 
>     n->aer_queued++;
>     ...
> }
> 

No, I'll pick up your series as is for nvme-next later today if no one
complains! :)

Klaus Jensen Jan. 20, 2021, 9:21 a.m. UTC | #4
On Jan 15 11:26, zhenwei pi wrote:
> v3 -> v4:
> - Drop "Fix overwritten bar.cap". (Already fixed)
> 
> - Avoid to enqueue the duplicate event.
> 
> - Several minor changes for coding style & function/variable name.
> 
> v2 -> v3:
> - Introduce "Persistent Memory Region has become read-only or
>   unreliable"
> 
> - Fix overwritten bar.cap
> 
> - Check smart critical warning value from QOM.
> 
> - Trigger asynchronous event during smart warning injection.
> 
> v1 -> v2:
> - Suggested by Philippe & Klaus, set/get smart_critical_warning by QMP.
> 
> v1:
> - Add smart_critical_warning for nvme device which can be set by QEMU
>   command line to emulate hardware error.
> 
> Zhenwei Pi (3):
>   block/nvme: introduce bit 5 for critical warning
>   hw/block/nvme: add smart_critical_warning property
>   hw/blocl/nvme: trigger async event during injecting smart warning
> 
>  hw/block/nvme.c      | 91 +++++++++++++++++++++++++++++++++++++++-----
>  hw/block/nvme.h      |  1 +
>  include/block/nvme.h |  3 ++
>  3 files changed, 86 insertions(+), 9 deletions(-)
> 

Thanks! Applied to nvme-next.