diff mbox

libiscsi task cancellation

Message ID CAJSP0QVn2Mxw=BDGaBerLQp3-RDtHaBjPDWe1je9pTw_T7i44A@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Stefan Hajnoczi Feb. 8, 2018, 3:08 p.m. UTC
On Thu, Feb 8, 2018 at 2:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
>> Now on to libiscsi:
>>
>> The iscsi_task_mgmt_async() API documentation says:
>>
>>   * abort_task will also cancel the scsi task. The callback for the
>> scsi task will be invoked with
>>   *            SCSI_STATUS_CANCELLED
>>
>> I see that the ABORT TASK TMF response invokes the user's
>> iscsi_task_mgmt_async() callback but not the command callback.  I'm
>> not sure how the command callback is invoked with
>> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
>> that response.
>>
>> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
>
> No, and QEMU is assuming the "wrong" behavior:
>
> static void
> iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
>                     void *private_data)
> {
>     IscsiAIOCB *acb = private_data;
>
>     acb->status = -ECANCELED;
>     iscsi_schedule_bh(acb);
> }

FWIW My recent use-after-free fix series is still pending review:
https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg00570.html

It changes the code and fixes a QEMU bug, but it doesn't address the
iscsi_task_mgmt_async() issue we're discussing now.

I think users *can* work around the iscsi_task_mgmt_async() issue as follows:

If the command callback has not been invoked yet when the ABORT TASK
TMF callback is invoked, call iscsi_scsi_cancel_task() so that
libiscsi invokes the command callback with SCSI_STATUS_CANCELLED and
frees the PDU.

Here is how that looks in QEMU:

Comments

Stefan Hajnoczi Feb. 8, 2018, 3:13 p.m. UTC | #1
On Thu, Feb 8, 2018 at 3:08 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Thu, Feb 8, 2018 at 2:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
> I think users *can* work around the iscsi_task_mgmt_async() issue as follows:
>
> If the command callback has not been invoked yet when the ABORT TASK
> TMF callback is invoked, call iscsi_scsi_cancel_task() so that
> libiscsi invokes the command callback with SCSI_STATUS_CANCELLED and
> frees the PDU.
>
> Here is how that looks in QEMU:
>
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 8140baac15..7b4b43dc55 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -290,8 +290,12 @@ iscsi_abort_task_cb(struct iscsi_context *iscsi,
> int status, void *command_data,
>  {
>      IscsiAIOCB *acb = private_data;
>
> -    acb->status = -ECANCELED;
> -    iscsi_schedule_bh(acb);
> +    /* If the command callback hasn't been called yet, drop the task */
> +    if (!acb->bh) {
> +        /* Call iscsi_aio_ioctl_cb() with SCSI_STATUS_CANCELLED */
> +        iscsi_scsi_cancel_task(iscsi, acb->task);
> +    }
> +
>      qemu_aio_unref(acb); /* acquired in iscsi_aio_cancel() */
>  }

By the way, this is on top of
https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg00570.html.

Stefan
Felipe Franciosi Feb. 14, 2018, 4:12 p.m. UTC | #2
> On 8 Feb 2018, at 15:08, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
> On Thu, Feb 8, 2018 at 2:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
>>> Now on to libiscsi:
>>> 
>>> The iscsi_task_mgmt_async() API documentation says:
>>> 
>>>  * abort_task will also cancel the scsi task. The callback for the
>>> scsi task will be invoked with
>>>  *            SCSI_STATUS_CANCELLED
>>> 
>>> I see that the ABORT TASK TMF response invokes the user's
>>> iscsi_task_mgmt_async() callback but not the command callback.  I'm
>>> not sure how the command callback is invoked with
>>> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
>>> that response.
>>> 
>>> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
>> 
>> No, and QEMU is assuming the "wrong" behavior:
>> 
>> static void
>> iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
>>                    void *private_data)
>> {
>>    IscsiAIOCB *acb = private_data;
>> 
>>    acb->status = -ECANCELED;
>>    iscsi_schedule_bh(acb);
>> }
> 
> FWIW My recent use-after-free fix series is still pending review:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.gnu.org_archive_html_qemu-2Ddevel_2018-2D02_msg00570.html&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=CCrJKVC5zGot8PrnI-iYe00MdX4pgdQfMRigp14Ptmk&m=E9W0Q_a83TeT4c8PEn--G-Dk9dZ5UGO2CZRZzal0Sak&s=BC2keaBG9mG3yYI_IYPwAoA3_I69KmPZordXnDguF0I&e=
> 
> It changes the code and fixes a QEMU bug, but it doesn't address the
> iscsi_task_mgmt_async() issue we're discussing now.
> 
> I think users *can* work around the iscsi_task_mgmt_async() issue as follows:
> 
> If the command callback has not been invoked yet when the ABORT TASK
> TMF callback is invoked, call iscsi_scsi_cancel_task() so that
> libiscsi invokes the command callback with SCSI_STATUS_CANCELLED and
> frees the PDU.
> 
> Here is how that looks in QEMU:
> 
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 8140baac15..7b4b43dc55 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -290,8 +290,12 @@ iscsi_abort_task_cb(struct iscsi_context *iscsi,
> int status, void *command_data,
> {
>     IscsiAIOCB *acb = private_data;
> 
> -    acb->status = -ECANCELED;
> -    iscsi_schedule_bh(acb);
> +    /* If the command callback hasn't been called yet, drop the task */
> +    if (!acb->bh) {
> +        /* Call iscsi_aio_ioctl_cb() with SCSI_STATUS_CANCELLED */
> +        iscsi_scsi_cancel_task(iscsi, acb->task);
> +    }
> +
>     qemu_aio_unref(acb); /* acquired in iscsi_aio_cancel() */
> }

Reviewed-by: Felipe Franciosi <felipe@nutanix.com>
Tested-by: Sreejith Mohanan <sreejit.mohanan@nutanix.com>

We've been over this and Sreejith confirmed this fixes the issue we are seeing. Nevertheless, I think we should continue this conversation to clarify what are the expectations with regards to requests that may still be inflight.

F.
diff mbox

Patch

diff --git a/block/iscsi.c b/block/iscsi.c
index 8140baac15..7b4b43dc55 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -290,8 +290,12 @@  iscsi_abort_task_cb(struct iscsi_context *iscsi,
int status, void *command_data,
 {
     IscsiAIOCB *acb = private_data;

-    acb->status = -ECANCELED;
-    iscsi_schedule_bh(acb);
+    /* If the command callback hasn't been called yet, drop the task */
+    if (!acb->bh) {
+        /* Call iscsi_aio_ioctl_cb() with SCSI_STATUS_CANCELLED */
+        iscsi_scsi_cancel_task(iscsi, acb->task);
+    }
+
     qemu_aio_unref(acb); /* acquired in iscsi_aio_cancel() */
 }