Message ID | CAAKa2jn2vwC6oZtrL6CtbQ5U7r7tS4GXmmALjCMSby6KiupeAQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix assertion failure in lsi53c810 emulator | expand |
Hi, On 6/12/21 8:24 AM, Liu Cyrus wrote: > Hi folks, this is a suggestion for fixing this bug. > I'm willing to discuss this with you because I'm new to contribute to QEMU. Thanks for your fix! You didn't Cc'ed the maintainers of the SCSI subsystem (see https://wiki.qemu.org/Contribute/SubmitAPatch#CC_the_relevant_maintainer ) so I'm doing it for you: $ ./scripts/get_maintainer.pl -f hw/scsi/lsi53c895a.c Paolo Bonzini <pbonzini@redhat.com> (supporter:SCSI) Fam Zheng <fam@euphon.net> (reviewer:SCSI) > > Best, > Qiang Liu > From: cyruscyliu <cyruscyliu@gmail.com <mailto:cyruscyliu@gmail.com>> > > An assertion failure can be triggered in the lsi53c810 emulator by a guest > when ((s->sstat1 & 0x7) == PHASE_DO) || (s->sstat1 & 0x7) == PHASE_DI)) > && (!s->current) holds. > Check s->sstat1 and s->current in lsi_reg_writeb before lsi_execute_script() > to discard this MMIO write. > > Fixes: 7d8406be69ce ("PCI SCSI HBA emulation.") > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/305 > <https://gitlab.com/qemu-project/qemu/-/issues/305> > Buglink: https://bugs.launchpad.net/qemu/+bug/1908515 > <https://bugs.launchpad.net/qemu/+bug/1908515> It seems you didn't send your patch with the proper tool, see https://wiki.qemu.org/Contribute/SubmitAPatch#Use_git_format-patch Please also mention the reporter: Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr> > Signed-off-by: cyruscyliu <cyruscyliu@gmail.com > <mailto:cyruscyliu@gmail.com>> Also your git-config is missing your name. Fixable using: $ git config user.name "Liu Cyrus" for your QEMU repository, or globally: $ git config --global user.name "Liu Cyrus" > --- > hw/scsi/lsi53c895a.c | 18 ++++++++++++++++-- > 1 file changed, 16 insertions(+), 2 deletions(-) > > diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c > index e2c1918..5c08f7f 100644 > --- a/hw/scsi/lsi53c895a.c > +++ b/hw/scsi/lsi53c895a.c > @@ -1919,6 +1919,10 @@ static void lsi_reg_writeb(LSIState *s, int > offset, uint8_t val) > lsi_update_irq(s); > } > if (s->waiting == LSI_WAIT_RESELECT && val & LSI_ISTAT0_SIGP) { > + if (!(((((s->sstat1 & 0x7) == PHASE_DO) > + || (s->sstat1 & 0x7) == PHASE_DI)) > + && s->current)) > + break; > trace_lsi_awoken(); > s->waiting = LSI_NOWAIT; > s->dsp = s->dnad; > @@ -1980,8 +1984,13 @@ static void lsi_reg_writeb(LSIState *s, int > offset, uint8_t val) > * instruction. Is this correct? > */ > if ((s->dmode & LSI_DMODE_MAN) == 0 > - && (s->istat1 & LSI_ISTAT1_SRUN) == 0) > + && (s->istat1 & LSI_ISTAT1_SRUN) == 0) { > + if (!(((((s->sstat1 & 0x7) == PHASE_DO) > + || (s->sstat1 & 0x7) == PHASE_DI)) > + && s->current)) Instead of duplicating multiple times the same complex code, you could add an helper once and call it. Something like: static bool can_execute(LSIState *s) { if (!s->current) { return false; } switch (s->sstat1 & 0x7) { case PHASE_DO: case PHASE_DI: return true; default: return false; } } which is while being more verbose, is easier to read. Then you could use: if (!can_execute(s)) { ... } > + break; > lsi_execute_script(s); > + } > break; > CASE_SET_REG32(dsps, 0x30) > CASE_SET_REG32(scratch[0], 0x34) > @@ -2001,8 +2010,13 @@ static void lsi_reg_writeb(LSIState *s, int > offset, uint8_t val) > * FIXME: if s->waiting != LSI_NOWAIT, this will only execute one > * instruction. Is this correct? > */ > - if ((val & LSI_DCNTL_STD) && (s->istat1 & LSI_ISTAT1_SRUN) == 0) > + if ((val & LSI_DCNTL_STD) && (s->istat1 & LSI_ISTAT1_SRUN) == 0) { > + if (!(((((s->sstat1 & 0x7) == PHASE_DO) > + || (s->sstat1 & 0x7) == PHASE_DI)) > + && s->current)) > + break; > lsi_execute_script(s); > + } > break; > case 0x40: /* SIEN0 */ > s->sien0 = val; > -- > 2.7.4 However back to the bug you are trying to fix, I wonder if your check is correct regarding the hardware we are modelling. Have you looked at the specifications? See https://docs.broadcom.com/doc/12352475 Chapter 5.3 Block Move Instruction (from SCSI SCRIPTS Instruction Set), "DMA Command" register. Why are we reaching these places with s->current == NULL in the first place? We are probably missing something earlier. Regards, Phil.
Hi Phil, > You didn't Cc'ed the maintainers of the SCSI subsystem (see > https://wiki.qemu.org/Contribute/SubmitAPatch#CC_the_relevant_maintainer > ) so I'm doing it for you: Thank you! > It seems you didn't send your patch with the proper tool, see > https://wiki.qemu.org/Contribute/SubmitAPatch#Use_git_format-patch > > Please also mention the reporter: > > Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr> > > Also your git-config is missing your name. Fixable using: > > $ git config user.name "Liu Cyrus" > > for your QEMU repository, or globally: > > $ git config --global user.name "Liu Cyrus" Thank you again. I'm sorry that I do miss several basic settings. I will do better next time. > Instead of duplicating multiple times the same complex code, you could > add a helper once and call it. This is really a good idea. > However back to the bug you are trying to fix, I wonder if your check > is correct regarding the hardware we are modeling. Have you looked > at the specifications? See https://docs.broadcom.com/doc/12352475 > Chapter 5.3 Block Move Instruction (from SCSI SCRIPTS Instruction Set), > "DMA Command" register. To be honest, I didn't check the specification at that time. I formed this patch following the idea that we could discard an invalid MMIO operation to avoid crashing. Does it seem that this idea is not enough to form a good patch? What are the best practices to fix an assertion failure in QEMU? > Why are we reaching these places with s->current == NULL in the first > place? We are probably missing something earlier. I checked the specification for several hours today, but it is too difficult for me. I need more time to understand it and form a better patch. When reproducing the crash, I found that I didn't prepare any script to be executed by lsi_execute_script. However, `insn = read_dword(s, s->dsp)` did get an instruction at `s->dsp`. Maybe I should check this more. Best, Qiang
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c index e2c1918..5c08f7f 100644 --- a/hw/scsi/lsi53c895a.c +++ b/hw/scsi/lsi53c895a.c @@ -1919,6 +1919,10 @@ static void lsi_reg_writeb(LSIState *s, int offset, uint8_t val) lsi_update_irq(s); } if (s->waiting == LSI_WAIT_RESELECT && val & LSI_ISTAT0_SIGP) { + if (!(((((s->sstat1 & 0x7) == PHASE_DO) + || (s->sstat1 & 0x7) == PHASE_DI)) + && s->current)) + break; trace_lsi_awoken(); s->waiting = LSI_NOWAIT;