diff mbox

ide:ide-cd: fix kernel panic resulting from missing scsi_req_init

Message ID 1509552273.2530.25.camel@wdc.com (mailing list archive)
State New, archived
Headers show

Commit Message

Bart Van Assche Nov. 1, 2017, 4:04 p.m. UTC
On Wed, 2017-11-01 at 09:44 +0800, Hongxu Jia wrote:
> On 2017年10月31日 23:23, Bart Van Assche wrote:
> > On Tue, 2017-10-31 at 15:39 +0800, Hongxu Jia wrote:
> > > Since we split the scsi_request out of struct request, while the
> > > standard prep_rq_fn builds 10 byte cmds, it missed to invoke
> > > scsi_req_init() to initialize certain fields of a scsi_request
> > > structure (.__cmd[], .cmd, .cmd_len and .sense_len but no other
> > > members of struct scsi_request).
> > > 
> > > An example panic on virtual machines (qemu/virtualbox) to boot
> > > from IDE cdrom:
> > > ...
> > > [    8.754381] Call Trace:
> > > [    8.755419]  blk_peek_request+0x182/0x2e0
> > > [    8.755863]  blk_fetch_request+0x1c/0x40
> > > [    8.756148]  ? ktime_get+0x40/0xa0
> > > [    8.756385]  do_ide_request+0x37d/0x660
> > > [    8.756704]  ? cfq_group_service_tree_add+0x98/0xc0
> > > [    8.757011]  ? cfq_service_tree_add+0x1e5/0x2c0
> > > [    8.757313]  ? ktime_get+0x40/0xa0
> > > [    8.757544]  __blk_run_queue+0x3d/0x60
> > > [    8.757837]  queue_unplugged+0x2f/0xc0
> > > [    8.758088]  blk_flush_plug_list+0x1f4/0x240
> > > [    8.758362]  blk_finish_plug+0x2c/0x40
> > > ...
> > > [    8.770906] RIP: ide_cdrom_prep_fn+0x63/0x180 RSP: ffff92aec018bae8
> > > [    8.772329] ---[ end trace 6408481e551a85c9 ]---
> > > ...
> > 
> > With which kernel version did you encounter this kernel panic? IDE CD-ROM
> > access works fine here from inside qemu with kernel v4.14.0-rc6.
> 
> I also compiled with kernel 4.14.0-rc6, and it failed.
> 
> Ubuntu 17.10, Fedora 27 do not have the same issue,
> because they disable ide and use ata piix to instead.
> 
> Ubuntu 17.10, kernel 4.13.0-16
> vim /boot/config-4.13.0-16-generic
> ...
> # CONFIG_IDE is not set
> CONFIG_ATA_PIIX=y
> [ ... ]
> What about your kernel config and boot log?

If I disable CONFIG_ATA and enable CONFIG_IDE I can reproduce this crash. As
you probably know request allocation follows one of these code paths with the
legacy block layer:

blk_get_request(q, op, gfp)
-> blk_old_get_request(q, op, gfp)
  -> get_request(q, op, bio, gfp)
    -> __get_request(rl, op, bio, gfp)
      -> blk_rq_init(q, rq)

generic_make_request(bio)
-> blk_queue_bio(q, bio)
  -> get_request(q, op, bio, gfp)
    -> __get_request(rl, op, bio, gfp)
      -> blk_rq_init(q, rq)

ide_initialize_rq() gets called from inside blk_get_request() but does not get
called in the second case. One possible solution for this kernel panic is to
call .initialize_rq_fn() also for filesystem requests. However, a patch that
realized this got rejected (see also
https://www.mail-archive.com/linux-block@vger.kernel.org/msg12160.html).

How about replacing your patch with something like the patch below? The advantages
of the patch below are:
- memset(req->cmd, 0, BLK_MAX_CDB) is called once instead of twice.
- The sense buffer pointer gets initialized.

The ide_initialize_rq() implementation is as follows:

static void ide_initialize_rq(struct request *rq)
{
	struct ide_request *req = blk_mq_rq_to_pdu(rq);

	scsi_req_init(&req->sreq);
	req->sreq.sense = req->sense;
}



Thanks,

Bart.

Comments

Hongxu Jia Nov. 2, 2017, 1:47 a.m. UTC | #1
On 2017年11月02日 00:04, Bart Van Assche wrote:
> On Wed, 2017-11-01 at 09:44 +0800, Hongxu Jia wrote:
>> On 2017年10月31日 23:23, Bart Van Assche wrote:
>>> On Tue, 2017-10-31 at 15:39 +0800, Hongxu Jia wrote:
>>>> Since we split the scsi_request out of struct request, while the
>>>> standard prep_rq_fn builds 10 byte cmds, it missed to invoke
>>>> scsi_req_init() to initialize certain fields of a scsi_request
>>>> structure (.__cmd[], .cmd, .cmd_len and .sense_len but no other
>>>> members of struct scsi_request).
>>>>
>>>> An example panic on virtual machines (qemu/virtualbox) to boot
>>>> from IDE cdrom:
>>>> ...
>>>> [    8.754381] Call Trace:
>>>> [    8.755419]  blk_peek_request+0x182/0x2e0
>>>> [    8.755863]  blk_fetch_request+0x1c/0x40
>>>> [    8.756148]  ? ktime_get+0x40/0xa0
>>>> [    8.756385]  do_ide_request+0x37d/0x660
>>>> [    8.756704]  ? cfq_group_service_tree_add+0x98/0xc0
>>>> [    8.757011]  ? cfq_service_tree_add+0x1e5/0x2c0
>>>> [    8.757313]  ? ktime_get+0x40/0xa0
>>>> [    8.757544]  __blk_run_queue+0x3d/0x60
>>>> [    8.757837]  queue_unplugged+0x2f/0xc0
>>>> [    8.758088]  blk_flush_plug_list+0x1f4/0x240
>>>> [    8.758362]  blk_finish_plug+0x2c/0x40
>>>> ...
>>>> [    8.770906] RIP: ide_cdrom_prep_fn+0x63/0x180 RSP: ffff92aec018bae8
>>>> [    8.772329] ---[ end trace 6408481e551a85c9 ]---
>>>> ...
>>> With which kernel version did you encounter this kernel panic? IDE CD-ROM
>>> access works fine here from inside qemu with kernel v4.14.0-rc6.
>> I also compiled with kernel 4.14.0-rc6, and it failed.
>>
>> Ubuntu 17.10, Fedora 27 do not have the same issue,
>> because they disable ide and use ata piix to instead.
>>
>> Ubuntu 17.10, kernel 4.13.0-16
>> vim /boot/config-4.13.0-16-generic
>> ...
>> # CONFIG_IDE is not set
>> CONFIG_ATA_PIIX=y
>> [ ... ]
>> What about your kernel config and boot log?
> If I disable CONFIG_ATA and enable CONFIG_IDE I can reproduce this crash. As
> you probably know request allocation follows one of these code paths with the
> legacy block layer:
>
> blk_get_request(q, op, gfp)
> -> blk_old_get_request(q, op, gfp)
>    -> get_request(q, op, bio, gfp)
>      -> __get_request(rl, op, bio, gfp)
>        -> blk_rq_init(q, rq)
>
> generic_make_request(bio)
> -> blk_queue_bio(q, bio)
>    -> get_request(q, op, bio, gfp)
>      -> __get_request(rl, op, bio, gfp)
>        -> blk_rq_init(q, rq)
>
> ide_initialize_rq() gets called from inside blk_get_request() but does not get
> called in the second case. One possible solution for this kernel panic is to
> call .initialize_rq_fn() also for filesystem requests. However, a patch that
> realized this got rejected (see also
> https://www.mail-archive.com/linux-block@vger.kernel.org/msg12160.html).
>
> How about replacing your patch with something like the patch below? The advantages
> of the patch below are:
> - memset(req->cmd, 0, BLK_MAX_CDB) is called once instead of twice.
> - The sense buffer pointer gets initialized.

Thanks, It makes sense, v2 incoming.

//Hongxu

> The ide_initialize_rq() implementation is as follows:
>
> static void ide_initialize_rq(struct request *rq)
> {
> 	struct ide_request *req = blk_mq_rq_to_pdu(rq);
>
> 	scsi_req_init(&req->sreq);
> 	req->sreq.sense = req->sense;
> }
>
>
> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index 81e18f9628d0..09b5bdb1af64 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -1328,7 +1328,7 @@ static int ide_cdrom_prep_fs(struct request_queue *q, struct request *rq)
>   	unsigned long blocks = blk_rq_sectors(rq) / (hard_sect >> 9);
>   	struct scsi_request *req = scsi_req(rq);
>   
> -	memset(req->cmd, 0, BLK_MAX_CDB);
> +	q->initialize_rq_fn(rq);
>   
>   	if (rq_data_dir(rq) == READ)
>   		req->cmd[0] = GPCMD_READ_10;
>
> Thanks,
>
> Bart.
diff mbox

Patch

diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index 81e18f9628d0..09b5bdb1af64 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -1328,7 +1328,7 @@  static int ide_cdrom_prep_fs(struct request_queue *q, struct request *rq)
 	unsigned long blocks = blk_rq_sectors(rq) / (hard_sect >> 9);
 	struct scsi_request *req = scsi_req(rq);
 
-	memset(req->cmd, 0, BLK_MAX_CDB);
+	q->initialize_rq_fn(rq);
 
 	if (rq_data_dir(rq) == READ)
 		req->cmd[0] = GPCMD_READ_10;