diff mbox series

megasa regression in 7.1?

Message ID 4A0D1260-DB8D-47CA-9369-6F3C0B7296C9@oracle.com (mailing list archive)
State New, archived
Headers show
Series megasa regression in 7.1? | expand

Commit Message

John Johnson Oct. 28, 2022, 5:47 a.m. UTC
I pulled 7.1, and the megasas driver stopped being able to do reads from a disk.
It looks to be related to this commit:

https://github.com/qemu/qemu/commit/fe9d8927e265fd723a6dc87cd6d220f4677dbe1f#diffe3f5f30efc54747e0624dca63e5f55f0012736c1875b6e85526b3514e6911be3

which added some command buffer bounds checking to the SCSI subsysem.  Unfortunately,
when the megasas QEMU emulation receives a direct I/O command from the device driver
in megasas_handle_io(), it synthesizes a SCSI command from it in megasas_encode_lba(),
but passes the command buffer length from the driver frame instead of the length of the
buffer it synthesized the SCSI command in.  The driver (at least the Linux 4.14 version
I’m using) does not fill in the command buffer length in direct I/O frames, so
scsi_req_new() sees a 0 length command and fails it.


	I worked around this issue with:


and the driver can read the disk again, but I’m not sure this is the correct
fix since cdb_len is used for bounds checking elsewhere in megagsas_handle_io(),
although a 0 won’t fail there.

	Is there anyone with megasas experience who could comment on this?
Thanks

								JJ

Comments

Thomas Huth Oct. 28, 2022, 9:10 a.m. UTC | #1
On 28/10/2022 07.47, John Johnson wrote:
> 
> 	I pulled 7.1, and the megasas driver stopped being able to do reads from a disk.
> It looks to be related to this commit:
> 
> https://github.com/qemu/qemu/commit/fe9d8927e265fd723a6dc87cd6d220f4677dbe1f#diffe3f5f30efc54747e0624dca63e5f55f0012736c1875b6e85526b3514e6911be3
> 
> which added some command buffer bounds checking to the SCSI subsysem.  Unfortunately,
> when the megasas QEMU emulation receives a direct I/O command from the device driver
> in megasas_handle_io(), it synthesizes a SCSI command from it in megasas_encode_lba(),
> but passes the command buffer length from the driver frame instead of the length of the
> buffer it synthesized the SCSI command in.  The driver (at least the Linux 4.14 version
> I’m using) does not fill in the command buffer length in direct I/O frames, so
> scsi_req_new() sees a 0 length command and fails it.
> 
> 
> 	I worked around this issue with:
> 
> diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
> index 7082456..6e11607 100644
> --- a/hw/scsi/megasas.c
> +++ b/hw/scsi/megasas.c
> @@ -1823,7 +1823,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
>   
>       megasas_encode_lba(cdb, lba_start, lba_count, is_write);
>       cmd->req = scsi_req_new(sdev, cmd->index,
> -                            lun_id, cdb, cdb_len, cmd);
> +                            lun_id, cdb, sizeof (cdb), cmd);
>       if (!cmd->req) {
>           trace_megasas_scsi_req_alloc_failed(
>               mfi_frame_desc(frame_cmd), target_id, lun_id);
> 
> and the driver can read the disk again, but I’m not sure this is the correct
> fix since cdb_len is used for bounds checking elsewhere in megagsas_handle_io(),
> although a 0 won’t fail there.
> 
> 	Is there anyone with megasas experience who could comment on this?

No clue about that megasas problem, but it might help if you put the experts 
on CC: (which can be found in the MAINTAINERS file). Done now.

  Thomas
John Millikin Oct. 28, 2022, 10:48 a.m. UTC | #2
Passing `sizeof(cdb)` to `scsi_req_new()` looks like a correct fix to
me, but I'm not familiar enough with megasas / MegaRAID to be confident.

A possible slight alteration is to have `megasas_encode_lba()` return
the length of the CDB it synthesized, which IMO would make the
dependency more clear.

Two additional thoughts:

  1. The variable is called `cdb_len`, but maybe it would be better to
     have two separate variables `megasas_cdb_len` and `scsi_cdb_len`
     (with the buffer renamed to `scsi_cdb`).

  2. There is very similar logic in `megasas_handle_scsi()`, but in that
     function both `cdb` and `cdb_len` are obtained from the `MegasasCmd`.
     Would it be possible to use either an auxiliary function or a comment
     to disambiguate the expected meaning of "CDB" in both cases?

In general the QEMU code is written in a much terser style than I'm used
to, and I don't know to what extent reusing the same variable name with
different semantics is considered idiomatic here.


On Fri, Oct 28, 2022 at 11:10:46AM +0200, Thomas Huth wrote:
> On 28/10/2022 07.47, John Johnson wrote:
> > 
> > 	I pulled 7.1, and the megasas driver stopped being able to do reads from a disk.
> > It looks to be related to this commit:
> > 
> > https://github.com/qemu/qemu/commit/fe9d8927e265fd723a6dc87cd6d220f4677dbe1f#diffe3f5f30efc54747e0624dca63e5f55f0012736c1875b6e85526b3514e6911be3
> > 
> > which added some command buffer bounds checking to the SCSI subsysem.  Unfortunately,
> > when the megasas QEMU emulation receives a direct I/O command from the device driver
> > in megasas_handle_io(), it synthesizes a SCSI command from it in megasas_encode_lba(),
> > but passes the command buffer length from the driver frame instead of the length of the
> > buffer it synthesized the SCSI command in.  The driver (at least the Linux 4.14 version
> > I’m using) does not fill in the command buffer length in direct I/O frames, so
> > scsi_req_new() sees a 0 length command and fails it.
> > 
> > 
> > 	I worked around this issue with:
> > 
> > diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
> > index 7082456..6e11607 100644
> > --- a/hw/scsi/megasas.c
> > +++ b/hw/scsi/megasas.c
> > @@ -1823,7 +1823,7 @@ static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
> >       megasas_encode_lba(cdb, lba_start, lba_count, is_write);
> >       cmd->req = scsi_req_new(sdev, cmd->index,
> > -                            lun_id, cdb, cdb_len, cmd);
> > +                            lun_id, cdb, sizeof (cdb), cmd);
> >       if (!cmd->req) {
> >           trace_megasas_scsi_req_alloc_failed(
> >               mfi_frame_desc(frame_cmd), target_id, lun_id);
> > 
> > and the driver can read the disk again, but I’m not sure this is the correct
> > fix since cdb_len is used for bounds checking elsewhere in megagsas_handle_io(),
> > although a 0 won’t fail there.
> > 
> > 	Is there anyone with megasas experience who could comment on this?
> 
> No clue about that megasas problem, but it might help if you put the experts
> on CC: (which can be found in the MAINTAINERS file). Done now.
> 
>  Thomas
>
diff mbox series

Patch

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index 7082456..6e11607 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -1823,7 +1823,7 @@  static int megasas_handle_io(MegasasState *s, MegasasCmd *cmd, int frame_cmd)
 
     megasas_encode_lba(cdb, lba_start, lba_count, is_write);
     cmd->req = scsi_req_new(sdev, cmd->index,
-                            lun_id, cdb, cdb_len, cmd);
+                            lun_id, cdb, sizeof (cdb), cmd);
     if (!cmd->req) {
         trace_megasas_scsi_req_alloc_failed(
             mfi_frame_desc(frame_cmd), target_id, lun_id);