Message ID | 20170711170821.24669-1-ppandit@redhat.com (mailing list archive)
---|---
State | New, archived
On 07/11/2017 01:08 PM, P J P wrote:
> From: Prasad J Pandit <pjp@fedoraproject.org>
>
> When processing ATA_CACHE_FLUSH ide controller command,
> BlockDriverState object could be null. Add check to avoid
> null pointer dereference.
>

This happens through 0xE7, what QEMU calls WIN_CACHE_FLUSH and what ATA8-ACS3 describes as "FLUSH CACHE".

The core of the matter here is that any ATA device associated with a BlockBackend that has no media inserted will accept the command and call blk_aio_flush, which will later fail because blk_aio_prwv assumes that the BlockBackend it was given definitely has a BlockDriverState attached. The associated bdrv_inc_in_flight causes the null pointer dereference.

It's not immediately clear to me what the right fix is here:

(1) Should blk_ functions not assume they will have a BlockDriverState? (i.e. should an attempted blk_flush on an empty blk just succeed quietly, or is that inherently an error?)

(2) Should ATA reject such commands entirely?

(3) Both?

ATA says that CD-ROM drives /may/ accept FLUSH CACHE, but it doesn't really say what they have to do when there's no media. This is for flushing the write cache, after all, and our media are read-only for CD-ROMs. We could make FLUSH CACHE a no-op for CD-ROMs entirely, or I could remove our support for the command, since it is "optional" and I can't find any information on what, if anything, it should actually do.

Grumble grumble, ATA specs.

> Reported-by: Kieron Shorrock <kshorrock@paloaltonetworks.com>

Thank you for your report.
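The failure mode described above can be illustrated with a deliberately simplified model. The types and functions below (`BlockBackend`, `blk_bs`, `blk_flush_checked`) are toy stand-ins mirroring the names in the discussion, not QEMU's real API: a backend with no medium has a NULL root BlockDriverState, and bumping its in-flight counter without a check is the crash.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ENOMEDIUM
#define ENOMEDIUM ENODEV   /* ENOMEDIUM is Linux-specific; fall back elsewhere */
#endif

/* Toy stand-ins for QEMU's types; only the shape of the bug is modeled. */
typedef struct BlockDriverState {
    int in_flight;
} BlockDriverState;

typedef struct BlockBackend {
    BlockDriverState *root;   /* NULL when no medium is inserted */
} BlockBackend;

static BlockDriverState *blk_bs(BlockBackend *blk)
{
    return blk->root;
}

/* The crash described above: the equivalent of bs->in_flight++ runs
 * without checking bs. Guarding first turns the NULL dereference into
 * a clean -ENOMEDIUM. */
static int blk_flush_checked(BlockBackend *blk)
{
    BlockDriverState *bs = blk_bs(blk);
    if (bs == NULL) {
        return -ENOMEDIUM;
    }
    bs->in_flight++;          /* bdrv_inc_in_flight(bs) */
    /* ... the flush itself would run here ... */
    bs->in_flight--;
    return 0;
}
```

Where such a media check belongs (in the common blk_ helpers or in the ATA emulation) is exactly the open question of this thread.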
> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
> ---
>  block/block-backend.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 0df3457..6a543a4 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1168,8 +1168,13 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
>  {
>      BlkAioEmAIOCB *acb;
>      Coroutine *co;
> +    BlockDriverState *bs = blk_bs(blk);
>
> -    bdrv_inc_in_flight(blk_bs(blk));
> +    if (!bs) {
> +        return NULL;
> +    }
> +

This hotfix as posted is inappropriate, I think, because this path has never been able to return NULL before; many callers don't check whether their command was registered successfully and instead rely on the callback being executed.

> +    bdrv_inc_in_flight(bs);
>      acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
>      acb->rwco = (BlkRwCo) {
>          .blk    = blk,
> @@ -1182,7 +1187,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
>      acb->has_returned = false;
>
>      co = qemu_coroutine_create(co_entry, acb);
> -    bdrv_coroutine_enter(blk_bs(blk), co);
> +    bdrv_coroutine_enter(bs, co);
>
>      acb->has_returned = true;
>      if (acb->rwco.ret != NOT_DONE) {
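The objection about callers relying on the callback suggests an alternative shape for the fix, sketched below with toy types (`blk_aio_prwv_sketch`, `record_ret`, and the struct definitions are hypothetical, not QEMU code): instead of returning NULL, complete the request through the caller's own completion callback with -ENOMEDIUM, so the callback contract is preserved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ENOMEDIUM
#define ENOMEDIUM ENODEV   /* portability fallback for non-Linux hosts */
#endif

typedef void BlockCompletionFunc(void *opaque, int ret);

typedef struct BlockDriverState {
    int dummy;
} BlockDriverState;

typedef struct BlockBackend {
    BlockDriverState *root;   /* NULL when no medium is inserted */
} BlockBackend;

static int last_ret = 1;      /* records what the completion callback saw */

static void record_ret(void *opaque, int ret)
{
    (void)opaque;
    last_ret = ret;
}

/* Sketch of the alternative: never return NULL. On a missing BDS the
 * request is completed through the caller's callback with -ENOMEDIUM.
 * (Real QEMU would defer the callback via a bottom half rather than
 * invoking it synchronously as done here.) */
static void blk_aio_prwv_sketch(BlockBackend *blk,
                                BlockCompletionFunc *cb, void *opaque)
{
    if (blk->root == NULL) {
        cb(opaque, -ENOMEDIUM);
        return;
    }
    /* ... normal AIO submission path ... */
    cb(opaque, 0);
}
```

The key property is that every submission path, error or not, ends in exactly one callback invocation, which is what the unchecked callers depend on.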
On Mon, Jul 17, 2017 at 11:56:48AM -0400, John Snow wrote:
> On 07/11/2017 01:08 PM, P J P wrote:
> > From: Prasad J Pandit <pjp@fedoraproject.org>
> >
> > When processing ATA_CACHE_FLUSH ide controller command,
> > BlockDriverState object could be null. Add check to avoid
> > null pointer dereference.
>
> This happens through 0xE7, what QEMU calls WIN_CACHE_FLUSH and what
> ATA8-ACS3 describes as "FLUSH CACHE".
>
> The core of the matter here is that any ATA device associated with a
> BlockBackend that has no media inserted will accept the command and call
> blk_aio_flush, which will later fail because blk_aio_prwv assumes that
> the BlockBackend it was given definitely has a BlockDriverState attached.
>
> The associated bdrv_inc_in_flight causes the null pointer dereference.
>
> It's not immediately clear to me what the right fix is here:
>
> (1) Should blk_ functions not assume they will have a BlockDriverState?
> (i.e. should an attempted blk_flush on an empty blk just succeed
> quietly, or is that inherently an error?)

Hmm... BlockDriverState still has bdrv_is_inserted() even though BlockBackend->root can be NULL? CCing Markus in case he has thoughts on the BB/BDS split.

I find it weird that block-backend.c calls bdrv_inc_in_flight() and then bdrv_co_flush() will call it again. Perhaps we need to do this so that blk_drain() works correctly, but it's odd that they share the same counter variable.
On 21/07/2017 17:47, Stefan Hajnoczi wrote:
> Hmm... BlockDriverState still has bdrv_is_inserted() even though
> BlockBackend->root can be NULL? CCing Markus in case he has thoughts on
> the BB/BDS split.
>
> I find it weird that block-backend.c calls bdrv_inc_in_flight() and then
> bdrv_co_flush() will call it again. Perhaps we need to do this so that
> blk_drain() works correctly, but it's odd that they share the same
> counter variable.

See here: https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02305.html

> I need to count requests at the BB level because the blk_aio_*
> operations have a separate bottom half that is invoked if either 1) they
> never reach BDS (because of an error); or 2) the bdrv_co_* call
> completes without yielding. The count must be >0 when blk_aio_*
> returns, or bdrv_drain (and thus blk_drain) won't loop. Because
> bdrv_drain and blk_drain are conflated, the counter must be the BDS one.
>
> In turn, the BDS counter is needed because of the lack of isolated
> sections. The right design would be for blk_isolate_begin to call
> blk_drain on *other* BlockBackends reachable in a child-to-parent visit.
> Instead, until that is implemented, we have bdrv_drained_begin that
> emulates that through the same-named callback, followed by a
> parent-to-child bdrv_drain that is almost always unnecessary.

The full explanation of the long-term plans and what "isolated section" means is at https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02016.html. (Unfortunately, since then we've reintroduced the "double aio_poll" hack in BDRV_POLL_WHILE...)

Paolo
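The drain invariant quoted above ("the count must be >0 when blk_aio_* returns, or bdrv_drain won't loop") can be shown in miniature. This is a toy event loop, nothing like QEMU's real drain machinery, and all names here (`aio_submit`, `aio_poll_once`, `drain`) are invented for illustration: submission bumps the in-flight counter before returning, a deferred completion drops it, and drain polls until the counter reaches zero.

```c
#include <assert.h>

/* Toy model of the drain invariant: submission must bump in_flight
 * *before* returning, a (possibly deferred) completion drops it, and
 * drain loops until the counter reaches zero. */
static int in_flight;
static int pending;    /* completions queued but not yet run */
static int completed;

static void aio_submit(void)
{
    in_flight++;       /* if this happened later, drain could return too early */
    pending++;
}

static int aio_poll_once(void)
{
    if (pending > 0) {
        pending--;
        in_flight--;
        completed++;
        return 1;
    }
    return 0;
}

static void drain(void)
{
    while (in_flight > 0) {
        aio_poll_once();
    }
}
```

If the increment in `aio_submit` were deferred to the completion side, `drain` called right after submission would see `in_flight == 0` and return while work is still pending, which is the race the quoted mail is guarding against.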
On 07/23/2017 10:36 AM, Paolo Bonzini wrote:
> On 21/07/2017 17:47, Stefan Hajnoczi wrote:
>> Hmm... BlockDriverState still has bdrv_is_inserted() even though
>> BlockBackend->root can be NULL? CCing Markus in case he has thoughts on
>> the BB/BDS split.
>>
>> I find it weird that block-backend.c calls bdrv_inc_in_flight() and then
>> bdrv_co_flush() will call it again. Perhaps we need to do this so that
>> blk_drain() works correctly, but it's odd that they share the same
>> counter variable.
>
> See here:
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02305.html
>
>> I need to count requests at the BB level because the blk_aio_*
>> operations have a separate bottom half that is invoked if either 1) they
>> never reach BDS (because of an error); or 2) the bdrv_co_* call
>> completes without yielding. The count must be >0 when blk_aio_*
>> returns, or bdrv_drain (and thus blk_drain) won't loop. Because
>> bdrv_drain and blk_drain are conflated, the counter must be the BDS one.
>>
>> In turn, the BDS counter is needed because of the lack of isolated
>> sections. The right design would be for blk_isolate_begin to call
>> blk_drain on *other* BlockBackends reachable in a child-to-parent visit.
>> Instead, until that is implemented, we have bdrv_drained_begin that
>> emulates that through the same-named callback, followed by a
>> parent-to-child bdrv_drain that is almost always unnecessary.

OK, so you have some justifications for why it needs to be at the BB level...

> The full explanation of the long-term plans and what "isolated section"
> means is at
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02016.html.
> (Unfortunately, since then we've reintroduced the "double aio_poll" hack
> in BDRV_POLL_WHILE...)

I may need some nudging towards understanding what the right solution is here, though. Should blk_aio_flush assume that there is always a root BDS? Should it not assume that?

It's difficult for me to tell right now whether the bug is in the expectation the blk_ functions have, so that the caller should be amended, or whether the blk_ functions need changes in how they try to increment a counter that doesn't exist.

I can handle the former fairly easily; if it's the latter, I'm afraid it's stuck in the middle of some of your changes and I'd need a stronger hint.

John
On 01/08/2017 02:14, John Snow wrote:
> I may need some nudging towards understanding what the right solution
> here is, though. Should the blk_aio_flush assume that there always is a
> root BDS? should it not assume that?

I think blk_aio_flush is not special. If there is no root BDS, either you return -ENOMEDIUM, or you crash. But all functions should be doing the same.

The former makes sense, but right now blk_prwv, for one, is crashing if there is no root BDS, so the minimum patch would fix the caller rather than blk_aio_flush.

Paolo

> It's difficult for me to understand right now if the bug is in the
> expectation for the blk_ functions and the caller should be amended, or
> if you need changes to the way the blk_ functions are trying to
> increment a counter that doesn't exist.
>
> I can handle the former fairly easily; if it's the latter, I'm afraid
> it's stuck in the middle of some of your changes and I'd need a stronger
> hint.
Am 01.08.2017 um 10:35 hat Paolo Bonzini geschrieben:
> On 01/08/2017 02:14, John Snow wrote:
> > I may need some nudging towards understanding what the right solution
> > here is, though. Should the blk_aio_flush assume that there always is a
> > root BDS? should it not assume that?
>
> I think blk_aio_flush is not special. If there is no root BDS, either
> you return -ENOMEDIUM, or you crash. But all functions should be doing
> the same.

The intended semantics is that they return -ENOMEDIUM (or at least fail). This is how things have always worked; that it crashes now because of the bdrv_inc_in_flight() was not an intentional change, but simply a bug in that patch.

> The former makes sense, but right now blk_prwv for one are crashing if
> there is no root BDS so the minimum patch would fix the caller rather
> than blk_aio_flush.

The synchronous versions don't crash, and fixing blk_aio_prwv() would fix all cases if bdrv_inc_in_flight() were moved inside the coroutine, probably right before blk_aio_complete(). This would also be more consistent with how the synchronous versions work, increasing the in-flight count only by 1 rather than 2 for an AIO request.

Kevin
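The "1 rather than 2" counting difference above can be modeled crudely. This is only an illustration of the arithmetic, not of QEMU's coroutine machinery (standard C has no coroutines, so plain function calls stand in), and the names `aio_flush_current` / `aio_flush_moved` are invented for this sketch:

```c
#include <assert.h>

/* "peak" records the highest in-flight value observed during one request. */
static int in_flight;
static int peak;

static void inc(void)
{
    if (++in_flight > peak) {
        peak = in_flight;
    }
}

static void dec(void)
{
    in_flight--;
}

/* Current AIO path: the wrapper increments before entering the coroutine,
 * then the bdrv_co_* layer increments again -- one request counts twice. */
static void aio_flush_current(void)
{
    inc();    /* wrapper, before entering the coroutine */
    inc();    /* bdrv_co_flush, inside the coroutine */
    dec();
    dec();
}

/* With the wrapper's increment moved inside the coroutine, only one level
 * of counting remains per request, matching the synchronous path. */
static void aio_flush_moved(void)
{
    inc();    /* single increment, inside the coroutine */
    dec();
}
```

As a side effect of the move, a request that bails out before reaching the BDS (e.g. for a missing medium) never touches the counter at all, which is the crash path this thread started from.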
diff --git a/block/block-backend.c b/block/block-backend.c
index 0df3457..6a543a4 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1168,8 +1168,13 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
 {
     BlkAioEmAIOCB *acb;
     Coroutine *co;
+    BlockDriverState *bs = blk_bs(blk);
 
-    bdrv_inc_in_flight(blk_bs(blk));
+    if (!bs) {
+        return NULL;
+    }
+
+    bdrv_inc_in_flight(bs);
     acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
     acb->rwco = (BlkRwCo) {
         .blk    = blk,
@@ -1182,7 +1187,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
     acb->has_returned = false;
 
     co = qemu_coroutine_create(co_entry, acb);
-    bdrv_coroutine_enter(blk_bs(blk), co);
+    bdrv_coroutine_enter(bs, co);
 
     acb->has_returned = true;
     if (acb->rwco.ret != NOT_DONE) {