
qga: implement guest-file-ioctl

Message ID 1485943606-13998-1-git-send-email-lprosek@redhat.com (mailing list archive)
State New, archived

Commit Message

Ladi Prosek Feb. 1, 2017, 10:06 a.m. UTC
Analogous to guest-file-read and guest-file-write, this commit adds
support for issuing IOCTLs to files in the guest. With the goal of
abstracting away the differences between Posix ioctl() and Win32
DeviceIoControl() to provide one unified API, the schema distinguishes
between input and output buffer sizes (as required by Win32) and
allows the caller to supply either a 'buffer', a pointer to which will
be passed to the Posix ioctl(), or an integer argument, passed to
ioctl() directly.

Examples:

To issue an IOCTL that takes const int * as its argument, one would
call guest-file-ioctl with:
out-count = 0
in-buf-b64 = <base64-encoded binary representation of an integer>

To issue an IOCTL that takes int * as its argument, one would use:
out-count = sizeof(int)
in-buf-b64 = <empty string if the argument is output-only>
and the returned GuestFileIOCTL will contain:
count = sizeof(int)
buf-b64 = <base64-encoded binary representation of an integer>

To issue an IOCTL that takes int as its argument, one would use:
out-count = 0
in-buf-b64 = <empty string>
int-arg = <an integer>
This last example will work only with the Posix guest agent as
Win32 always uses indirect input and output data.
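
For illustration, the second example above might look like this on the
guest agent wire (the handle and IOCTL code values are placeholders,
and "AAIAAA==" simply stands for four base64-encoded output bytes):

  {"execute": "guest-file-ioctl",
   "arguments": {"handle": 1000, "code": 35074, "out-count": 4,
                 "in-buf-b64": ""}}

  {"return": {"retcode": 0, "count": 4, "buf-b64": "AAIAAA=="}}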

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
---
 qga/commands-posix.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 qga/commands-win32.c | 66 ++++++++++++++++++++++++++++++++++++++++++++
 qga/qapi-schema.json | 45 ++++++++++++++++++++++++++++++
 3 files changed, 189 insertions(+)

Comments

Daniel P. Berrangé Feb. 1, 2017, 10:20 a.m. UTC | #1
On Wed, Feb 01, 2017 at 11:06:46AM +0100, Ladi Prosek wrote:
> Analogous to guest-file-read and guest-file-write, this commit adds
> support for issuing IOCTLs to files in the guest. With the goal of
> abstracting away the differences between Posix ioctl() and Win32
> DeviceIoControl() to provide one unified API, the schema distinguishes
> between input and output buffer sizes (as required by Win32) and
> allows the caller to supply either a 'buffer', pointer to which will
> be passed to the Posix ioctl(), or an integer argument, passed to
> ioctl() directly.

What is the intended usage scenario for this ?

The size of arguments that need to be provided to ioctl()s vary on
the architecture of the guest kernel that's running, which cannot be
assumed to be the same as the architecture of the QEMU binary. ie
you can be running i686 kernel in an x86_64 QEMU.  So it feels like
it is going to be hard for applications to reliably use this feature
unless they have more information about the guest OS, that is not
currently provided by QEMU guest agent. So it feels like this design
is not likely to be satisfactory to me.

Regards,
Daniel
Ladi Prosek Feb. 1, 2017, 10:50 a.m. UTC | #2
On Wed, Feb 1, 2017 at 11:20 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
> On Wed, Feb 01, 2017 at 11:06:46AM +0100, Ladi Prosek wrote:
>> Analogous to guest-file-read and guest-file-write, this commit adds
>> support for issuing IOCTLs to files in the guest. With the goal of
>> abstracting away the differences between Posix ioctl() and Win32
>> DeviceIoControl() to provide one unified API, the schema distinguishes
>> between input and output buffer sizes (as required by Win32) and
>> allows the caller to supply either a 'buffer', pointer to which will
>> be passed to the Posix ioctl(), or an integer argument, passed to
>> ioctl() directly.
>
> What is the intended usage scenario for this ?

My specific case is extracting a piece of data from Windows guests.
Guest driver exposes a file object with a well-known IOCTL code to
return a data structure from the kernel.

> The size of arguments that need to be provided to ioctl()s vary on
> the architecture of the guest kernel that's running, which cannot be
> assumed to be the same as the architecture of the QEMU binary. ie
> you can be running i686 kernel in an x86_64 QEMU.  So it feels like
> it is going to be hard for applications to reliably use this feature
> unless they have more information about the guest OS, that is not
> currently provided by QEMU guest agent. So it feels like this design
> is not likely to be satisfactory to me.

I can turn this around and say that guest-file-read and
guest-file-write shouldn't exist because applications can't reliably
know what the format of files on the guest is.

This is just a transport, it doesn't need to understand the data it's
transporting. Yes, I get it that ioctl tends to be associated with
system-y things with tighter ties to the OS, bitness, endianness and
so on. But, it doesn't have to be. In my case I chose ioctl as opposed
to regular file I/O because what I'm reading is not a stream in
nature, it's a fixed size data structure.

Thanks!
Ladi

> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
Daniel P. Berrangé Feb. 1, 2017, 11:03 a.m. UTC | #3
On Wed, Feb 01, 2017 at 11:50:43AM +0100, Ladi Prosek wrote:
> On Wed, Feb 1, 2017 at 11:20 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
> > On Wed, Feb 01, 2017 at 11:06:46AM +0100, Ladi Prosek wrote:
> >> Analogous to guest-file-read and guest-file-write, this commit adds
> >> support for issuing IOCTLs to files in the guest. With the goal of
> >> abstracting away the differences between Posix ioctl() and Win32
> >> DeviceIoControl() to provide one unified API, the schema distinguishes
> >> between input and output buffer sizes (as required by Win32) and
> >> allows the caller to supply either a 'buffer', pointer to which will
> >> be passed to the Posix ioctl(), or an integer argument, passed to
> >> ioctl() directly.
> >
> > What is the intended usage scenario for this ?
> 
> My specific case is extracting a piece of data from Windows guests.
> Guest driver exposes a file object with a well-known IOCTL code to
> return a data structure from the kernel.

Please provide more information about what you're trying to do.

If we can understand the full details it might suggest a different
approach than exposing a generic ioctl passthrough facility.

> > The size of arguments that need to be provided to ioctl()s vary on
> > the architecture of the guest kernel that's running, which cannot be
> > assumed to be the same as the architecture of the QEMU binary. ie
> > you can be running i686 kernel in an x86_64 QEMU.  So it feels like
> > it is going to be hard for applications to reliably use this feature
> > unless they have more information about the guest OS, that is not
> > currently provided by QEMU guest agent. So it feels like this design
> > is not likely to be satisfactory to me.
> 
> I can turn this around and say that guest-file-read and
> guest-file-write shouldn't exist because applications can't reliably
> know what the format of files on the guest is.

No that's not the same thing at all. From the POV of the QEMU API, the
contents of the files do not matter - it is just a byte stream and the
guest agent API lets you read it in the same way no matter what is in
the files, or what OS is running. There's no need to know anything about
the guest OS in order to successfully read/write the entire file.

The ioctl design you've proposed here is different because you need to
know precise operating system details before you can use the API. I
think that's really flawed.

It would be possible to design a ioctl API that is more usable if you
didn't try to do generic passthrough of arbitrary ioctl commands. Instead
provide a QAPI schema that uses a union to provide strongly typed arguments
for the ioctl commands you care about using. Or completely separate QAPI
commands instead of multiplexing everything into an ioctl command.
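
To make that concrete, a strongly typed variant might be sketched
roughly as below; every name in it (the union, its branch, and the
payload struct) is hypothetical and only illustrates the shape of such
a schema, not an actual proposal:

  # Sketch only -- all names below are hypothetical.
  { 'struct': 'GuestIoctlCrashDumpHeaderArg',
    'data': { 'max-size': 'int' } }

  { 'union': 'GuestIoctlArg',
    'data': { 'crash-dump-header': 'GuestIoctlCrashDumpHeaderArg' } }

  { 'command': 'guest-file-ioctl',
    'data': { 'handle': 'int', 'arg': 'GuestIoctlArg' },
    'returns': 'GuestFileIOCTL' }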

Regards,
Daniel
Ladi Prosek Feb. 1, 2017, 1:41 p.m. UTC | #4
On Wed, Feb 1, 2017 at 12:03 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
> On Wed, Feb 01, 2017 at 11:50:43AM +0100, Ladi Prosek wrote:
>> On Wed, Feb 1, 2017 at 11:20 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
>> > On Wed, Feb 01, 2017 at 11:06:46AM +0100, Ladi Prosek wrote:
>> >> Analogous to guest-file-read and guest-file-write, this commit adds
>> >> support for issuing IOCTLs to files in the guest. With the goal of
>> >> abstracting away the differences between Posix ioctl() and Win32
>> >> DeviceIoControl() to provide one unified API, the schema distinguishes
>> >> between input and output buffer sizes (as required by Win32) and
>> >> allows the caller to supply either a 'buffer', pointer to which will
>> >> be passed to the Posix ioctl(), or an integer argument, passed to
>> >> ioctl() directly.
>> >
>> > What is the intended usage scenario for this ?
>>
>> My specific case is extracting a piece of data from Windows guests.
>> Guest driver exposes a file object with a well-known IOCTL code to
>> return a data structure from the kernel.
>
> Please provide more information about what you're trying to do.
>
> If we can understand the full details it might suggest a different
> approach that exposing a generic ioctl passthrough facility.

The end goal is to be able to create a Windows crash dump file from a
running (or crashed, but running is more interesting because Windows
can't do that by itself) Windows VM. To do that without resorting to
hacks, the host application driving this needs to get the crash dump
header, which Windows provides in its KeInitializeCrashDumpHeader
kernel API.

I believe that the most natural way to do this is to have
1) a driver installed in the guest providing a way to call
KeInitializeCrashDumpHeader from user space
2) an agent in the guest, running in user space, calling the driver
and passing the result back to the host

Now 2) may as well be an existing agent, such as the QEMU guest agent,
and that's why I am here :)

KeInitializeCrashDumpHeader returns an opaque byte array, happens to
be 8192 bytes at the moment. My first choice for the kernel-user
interface in the guest is IOCTL because what I'm trying to get across
is a block, a "datagram", not a stream, and I have the option for
easily adding more functionality later by adding more IOCTL codes with
the file object still representing "the driver".

I could use regular file I/O as well. I would either have to devise a
protocol for talking to the driver, have a way of delimiting messages,
re-syncing the channel etc., or make a slight semantic shift and
instead of the file object representing the driver, it would represent
this one particular function of the driver. Opening the file and
reading from it until eof would yield the crash dump header.
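
Expressed over the guest agent, that plan-B flow would be an ordinary
open/read/close sequence (the device path and handle value below are
made up for illustration):

  {"execute": "guest-file-open",
   "arguments": {"path": "\\\\.\\CrashDumpHeader", "mode": "r"}}
  {"return": 1000}

  {"execute": "guest-file-read",
   "arguments": {"handle": 1000, "count": 8192}}
  {"return": {"count": 8192, "buf-b64": "...", "eof": false}}

  {"execute": "guest-file-close",
   "arguments": {"handle": 1000}}
  {"return": {}}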

>> > The size of arguments that need to be provided to ioctl()s vary on
>> > the architecture of the guest kernel that's running, which cannot be
>> > assumed to be the same as the architecture of the QEMU binary. ie
>> > you can be running i686 kernel in an x86_64 QEMU.  So it feels like
>> > it is going to be hard for applications to reliably use this feature
>> > unless they have more information about the guest OS, that is not
>> > currently provided by QEMU guest agent. So it feels like this design
>> > is not likely to be satisfactory to me.
>>
>> I can turn this around and say that guest-file-read and
>> guest-file-write shouldn't exist because applications can't reliably
>> know what the format of files on the guest is.
>
> No that's not the same thing at all. From the POV of the QEMU API, the
> contents of the files do not matter - it is just a byte stream and the
> guest agent API lets you read it in the same way no matter what is in
> the files, or what OS is running. There's no need to know anything about
> the guest OS in order to successfully read/write the entire file.

Ok, so there are two different aspects that should probably be
separated. The actual semantics of IOCTL calls is equivalent to the
semantics of files in general. For files it's stream in, stream out
(and seeking and such). For IOCTL it's a buffer in, buffer out (and a
return code). The content is file specific, IOCTL code specific,
whatever. Definitely not guest agent's business to interpret it. Here
I think my analogy holds.

> The ioctl design you've proposed here is different because you need to
> know precise operating system details before you can use the API. I
> think that's really flawed.

Yes, maybe. Maybe the concept of the IOCTL syscall is just too
different across the guest operating systems supported by the agent. I
tried to abstract away the differences between Posix and Windows.
Perhaps I didn't do it right or there's more OSes to worry about. Yes,
in theory the data passed to or from IOCTL may contain pointers and
not be easily remotable. But it's not common. Files can also be opened
with all kinds of obscure flags on different OSes, many of which are
not supported by guest-file-open.

> It would be possible to design a ioctl API that is more usable if you
> didn't try to do generic passthrough of arbitrary ioctl commands. Instead
> provide a QAPI schema that uses a union to provide strongly typed arguments
> for the ioctl commands you care about using. Or completely separate QAPI
> commands instead of multiplexing everything into an ioctl command.

I can add a QAPI command for my specific use case if it's acceptable,
that's not a problem. Although at that point I would probably just go
back to my plan b and use regular file I/O and guest-file-read. I just
wanted to be as generic as possible to benefit other use cases as well
and I ended up with what's basically my stab at RPC ioctl.

> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
Eric Blake Feb. 1, 2017, 2:43 p.m. UTC | #5
On 02/01/2017 07:41 AM, Ladi Prosek wrote:

> 
> Ok, so there are two different aspects that should probably be
> separated. The actual semantics of IOCTL calls is equivalent to the
> semantics of files in general. For files it's stream in, stream out
> (and seeking and such). For IOCTL it's a buffer in, buffer out (and a
> return code). The content is file specific, IOCTL code specific,
> whatever. Definitely not guest agent's business to interpret it. Here
> I think my analogy holds.

I don't.  ioctl() is a leaky abstraction - it exists as the blanket
catch-all do-anything-else-that-doesn't-have-a-dedicated-API.  It is, by
its very nature, non-portable and VERY hard to abstract - because there
is no one abstraction for every possible ioctl code.  I strongly agree
with Dan here that you are going to get nowhere by adding a qga-ioctl
command, because there is no SANE way that you can specify it in QAPI.
Rather, add specific commands for the specific actions you want to
expose, and if the implementation of those specific commands uses a
particular ioctl() under the hood, fine.  But don't directly expose
ioctl() over the wire.
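
As a concrete illustration of the dedicated-command route, the
crash-dump-header case might be sketched like this (the command and
type names are hypothetical, not an existing or proposed qga API):

  # Sketch only -- hypothetical command, not part of this patch.
  { 'struct': 'GuestCrashDumpHeader',
    'data': { 'count': 'int', 'header-b64': 'str' } }

  { 'command': 'guest-get-crash-dump-header',
    'returns': 'GuestCrashDumpHeader' }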


>> It would be possible to design a ioctl API that is more usable if you
>> didn't try to do generic passthrough of arbitrary ioctl commands. Instead
>> provide a QAPI schema that uses a union to provide strongly typed arguments
>> for the ioctl commands you care about using. Or completely separate QAPI
>> commands instead of multiplexing everything into an ioctl command.
> 
> I can add a QAPI command for my specific use case if it's acceptable,

Certainly more palatable, and more likely to gain my acceptance.

> that's not a problem. Although at that point I would probably just go
> back to my plan b and use regular file I/O and guest-file-read. I just
> wanted to be as generic as possible to benefit other use cases as well
> and I ended up with what's basically my stab at RPC ioctl.

The problem is that RPC ioctl is not sanely possible.
Ladi Prosek Feb. 1, 2017, 3:43 p.m. UTC | #6
On Wed, Feb 1, 2017 at 3:43 PM, Eric Blake <eblake@redhat.com> wrote:
> On 02/01/2017 07:41 AM, Ladi Prosek wrote:
>
>>
>> Ok, so there are two different aspects that should probably be
>> separated. The actual semantics of IOCTL calls is equivalent to the
>> semantics of files in general. For files it's stream in, stream out
>> (and seeking and such). For IOCTL it's a buffer in, buffer out (and a
>> return code). The content is file specific, IOCTL code specific,
>> whatever. Definitely not guest agent's business to interpret it. Here
>> I think my analogy holds.
>
> I don't.  ioctl() is a leaky abstraction - it exists as the blanket
> catch-all do-anything-else-that-doesn't-have-a-dedicated-API.  It is, by
> its very nature, non-portable and VERY hard to abstract - because there
> is no one abstraction for every possible ioctl code.  I strongly agree
> with Dan here that you are going to get nowhere by adding a qga-ioctl
> command, because there is no SANE way that you can specify it in QAPI.
> Rather, add specific commands for the specific actions you want to
> expose, and if the implementation of those specific commands uses a
> particular ioctl() under the hood, fine.  But don't directly expose
> ioctl() over the wire.

If I wasn't outnumbered and didn't have a feeling that I already lost
:) I would ask you to show me a couple of real-world IOCTLs that don't
fit the abstraction proposed in this patch. And trade it off against
enabling all the rest that would fit. And I would maybe also say that
done is better than perfect but that might mean crossing the line and
getting myself into trouble :p

But yeah, whoever thought that having ... in a syscall signature is a
good idea should be shamed forever.

>>> It would be possible to design a ioctl API that is more usable if you
>>> didn't try to do generic passthrough of arbitrary ioctl commands. Instead
>>> provide a QAPI schema that uses a union to provide strongly typed arguments
>>> for the ioctl commands you care about using. Or completely separate QAPI
>>> commands instead of multiplexing everything into an ioctl command.
>>
>> I can add a QAPI command for my specific use case if it's acceptable,
>
> Certainly more palatable, and more likely to gain my acceptance.

I'll look into it. Thanks!

>> that's not a problem. Although at that point I would probably just go
>> back to my plan b and use regular file I/O and guest-file-read. I just
>> wanted to be as generic as possible to benefit other use cases as well
>> and I ended up with what's basically my stab at RPC ioctl.
>
> The problem is that RPC ioctl is not sanely possible.
>
> --
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
Paolo Bonzini Feb. 1, 2017, 8:24 p.m. UTC | #7
On 01/02/2017 06:43, Eric Blake wrote:
>>> It would be possible to design a ioctl API that is more usable if you
>>> didn't try to do generic passthrough of arbitrary ioctl commands. Instead
>>> provide a QAPI schema that uses a union to provide strongly typed arguments
>>> for the ioctl commands you care about using. Or completely separate QAPI
>>> commands instead of multiplexing everything into an ioctl command.
>>
>> I can add a QAPI command for my specific use case if it's acceptable,
> 
> Certainly more palatable, and more likely to gain my acceptance.
> 
>> that's not a problem. Although at that point I would probably just go
>> back to my plan b and use regular file I/O and guest-file-read. I just
>> wanted to be as generic as possible to benefit other use cases as well
>> and I ended up with what's basically my stab at RPC ioctl.
> 
> The problem is that RPC ioctl is not sanely possible.

Windows ioctl (DeviceIoControl) is a little different from POSIX ioctl,
more RPC and less one-size-fits-all bucket for syscallish stuff.  It is
still a leaky abstraction though.

Paolo
Markus Armbruster Feb. 2, 2017, 11:05 a.m. UTC | #8
Ladi Prosek <lprosek@redhat.com> writes:

> On Wed, Feb 1, 2017 at 12:03 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
>> On Wed, Feb 01, 2017 at 11:50:43AM +0100, Ladi Prosek wrote:
>>> On Wed, Feb 1, 2017 at 11:20 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
>>> > On Wed, Feb 01, 2017 at 11:06:46AM +0100, Ladi Prosek wrote:
>>> >> Analogous to guest-file-read and guest-file-write, this commit adds
>>> >> support for issuing IOCTLs to files in the guest. With the goal of
>>> >> abstracting away the differences between Posix ioctl() and Win32
>>> >> DeviceIoControl() to provide one unified API, the schema distinguishes
>>> >> between input and output buffer sizes (as required by Win32) and
>>> >> allows the caller to supply either a 'buffer', pointer to which will
>>> >> be passed to the Posix ioctl(), or an integer argument, passed to
>>> >> ioctl() directly.
>>> >
>>> > What is the intended usage scenario for this ?
>>>
>>> My specific case is extracting a piece of data from Windows guests.
>>> Guest driver exposes a file object with a well-known IOCTL code to
>>> return a data structure from the kernel.
>>
>> Please provide more information about what you're trying to do.
>>
>> If we can understand the full details it might suggest a different
>> approach that exposing a generic ioctl passthrough facility.
>
> The end goal is to be able to create a Window crash dump file from a
> running (or crashed, but running is more interesting because Windows
> can't do that by itself) Windows VM. To do that without resorting to
> hacks, the host application driving this needs to get the crash dump
> header, which Windows provides in its KeInitializeCrashDumpHeader
> kernel API.
>
> I believe that the most natural way to do this is to have
> 1) a driver installed in the guest providing a way to call
> KeInitializeCrashDumpHeader from user space
> 2) an agent in the guest, running in user space, calling the driver
> and passing the result back to the host
>
> Now 2) may as well be an existing agent, such as the QEMU guest agent,
> and that's why I am here :)

Makes sense.

> KeInitializeCrashDumpHeader returns an opaque byte array, happens to
> be 8192 bytes at the moment. My first choice for the kernel-user
> interface in the guest is IOCTL because what I'm trying to get across
> is a block, a "datagram", not a stream, and I have the option for
> easily adding more functionality later by adding more IOCTL codes with
> the file object still representing "the driver".

ioctl() is the traditional way to do something like this.

In Linux, you might well be asked to do a pseudo-file in /sys instead.

> I could use regular file I/O as well. I would either have to devise a
> protocol for talking to the driver, have a way of delimiting messages,
> re-syncing the channel etc.,

Way too much trouble.

>                              or make a slight semantic shift and
> instead of the file object representing the driver, it would represent
> this one particular function of the driver. Opening the file and
> reading from it until eof would yield the crash dump header.

Or even simpler: read(fd, buf, n) always returns the first @n bytes of
the crash dump header.  If you want to detect a size that doesn't match
your expectations, pass a slightly larger buffer.
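
In guest agent terms that size check could be a single read with a
slightly larger count (handle value hypothetical; whether eof shows up
on this read or only on the next one depends on the driver's read
semantics):

  {"execute": "guest-file-read",
   "arguments": {"handle": 1000, "count": 8193}}
  {"return": {"count": 8192, "buf-b64": "...", "eof": true}}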

>>> > The size of arguments that need to be provided to ioctl()s vary on
>>> > the architecture of the guest kernel that's running, which cannot be
>>> > assumed to be the same as the architecture of the QEMU binary. ie
>>> > you can be running i686 kernel in an x86_64 QEMU.  So it feels like
>>> > it is going to be hard for applications to reliably use this feature
>>> > unless they have more information about the guest OS, that is not
>>> > currently provided by QEMU guest agent. So it feels like this design
>>> > is not likely to be satisfactory to me.
>>>
>>> I can turn this around and say that guest-file-read and
>>> guest-file-write shouldn't exist because applications can't reliably
>>> know what the format of files on the guest is.
>>
>> No that's not the same thing at all. From the POV of the QEMU API, the
>> contents of the files do not matter - it is just a byte stream and the
>> guest agent API lets you read it in the same way no matter what is in
>> the files, or what OS is running. There's no need to know anything about
>> the guest OS in order to successfully read/write the entire file.
>
> Ok, so there are two different aspects that should probably be
> separated. The actual semantics of IOCTL calls is equivalent to the
> semantics of files in general. For files it's stream in, stream out
> (and seeking and such). For IOCTL it's a buffer in, buffer out (and a
> return code). The content is file specific, IOCTL code specific,
> whatever. Definitely not guest agent's business to interpret it. Here
> I think my analogy holds.

"Everything is a file".  Many times we made something not a file we
regretted it later.

>> The ioctl design you've proposed here is different because you need to
>> know precise operating system details before you can use the API. I
>> think that's really flawed.
>
> Yes, maybe. Maybe the concept of the IOCTL syscall is just too
> different across the guest operating systems supported by the agent. I
> tried to abstract away the differences between Posix and Windows.
> Perhaps I didn't do it right or there's more OSes to worry about. Yes,
> in theory the data passed to or from IOCTL may contain pointers and
> not be easily remotable. But it's not common. Files can also be opened
> with all kinds of obscure flags on different OSes, many of which are
> not supported by guest-file-open.

ioctl() is an ugly low-level interface, which makes it awkward to pass
through QGA.

Your point that we strip away details of the file I/O interfaces for QGA
is valid.  I wouldn't rule out the possibility of a suitably limited
ioctl() pass through out of hand.  But we'd probably want more than one
use case to demonstrate that it's useful despite the limitations.

>> It would be possible to design a ioctl API that is more usable if you
>> didn't try to do generic passthrough of arbitrary ioctl commands. Instead
>> provide a QAPI schema that uses a union to provide strongly typed arguments
>> for the ioctl commands you care about using. Or completely separate QAPI
>> commands instead of multiplexing everything into an ioctl command.
>
> I can add a QAPI command for my specific use case if it's acceptable,
> that's not a problem. Although at that point I would probably just go
> back to my plan b and use regular file I/O and guest-file-read. I just
> wanted to be as generic as possible to benefit other use cases as well
> and I ended up with what's basically my stab at RPC ioctl.

I appreciate you trying to provide something that's generally useful
instead of just hacking up a one-off that works for your special
problem.

Of course, designing a generally useful thing can sometimes be more
trouble than it's worth.  Use your judgement.
Denis V. Lunev Feb. 6, 2017, 3:37 p.m. UTC | #9
On 02/01/2017 04:41 PM, Ladi Prosek wrote:
> On Wed, Feb 1, 2017 at 12:03 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
>> On Wed, Feb 01, 2017 at 11:50:43AM +0100, Ladi Prosek wrote:
>>> On Wed, Feb 1, 2017 at 11:20 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
>>>> On Wed, Feb 01, 2017 at 11:06:46AM +0100, Ladi Prosek wrote:
>>>>> Analogous to guest-file-read and guest-file-write, this commit adds
>>>>> support for issuing IOCTLs to files in the guest. With the goal of
>>>>> abstracting away the differences between Posix ioctl() and Win32
>>>>> DeviceIoControl() to provide one unified API, the schema distinguishes
>>>>> between input and output buffer sizes (as required by Win32) and
>>>>> allows the caller to supply either a 'buffer', pointer to which will
>>>>> be passed to the Posix ioctl(), or an integer argument, passed to
>>>>> ioctl() directly.
>>>> What is the intended usage scenario for this ?
>>> My specific case is extracting a piece of data from Windows guests.
>>> Guest driver exposes a file object with a well-known IOCTL code to
>>> return a data structure from the kernel.
>> Please provide more information about what you're trying to do.
>>
>> If we can understand the full details it might suggest a different
>> approach that exposing a generic ioctl passthrough facility.
> The end goal is to be able to create a Window crash dump file from a
> running (or crashed, but running is more interesting because Windows
> can't do that by itself) Windows VM. To do that without resorting to
> hacks, the host application driving this needs to get the crash dump
> header, which Windows provides in its KeInitializeCrashDumpHeader
> kernel API.
>
> I believe that the most natural way to do this is to have
> 1) a driver installed in the guest providing a way to call
> KeInitializeCrashDumpHeader from user space
> 2) an agent in the guest, running in user space, calling the driver
> and passing the result back to the host
>
> Now 2) may as well be an existing agent, such as the QEMU guest agent,
> and that's why I am here :)
>
> KeInitializeCrashDumpHeader returns an opaque byte array, happens to
> be 8192 bytes at the moment. My first choice for the kernel-user
> interface in the guest is IOCTL because what I'm trying to get across
> is a block, a "datagram", not a stream, and I have the option for
> easily adding more functionality later by adding more IOCTL codes with
> the file object still representing "the driver".
>
> I could use regular file I/O as well. I would either have to devise a
> protocol for talking to the driver, have a way of delimiting messages,
> re-syncing the channel etc., or make a slight semantic shift and
> instead of the file object representing the driver, it would represent
> this one particular function of the driver. Opening the file and
> reading from it until eof would yield the crash dump header.
I think that this is not as good as it could be for the whole design of
the feature.
The problem here is that userspace starts toooooooooo late and is not
accessible when the guest BSODS and we do need a dump for analysis.

May be it worth to push this header to QEMU at boot time through virtio bus?

Den
Daniel P. Berrangé Feb. 6, 2017, 3:50 p.m. UTC | #10
On Mon, Feb 06, 2017 at 06:37:29PM +0300, Denis V. Lunev wrote:
> On 02/01/2017 04:41 PM, Ladi Prosek wrote:
> > On Wed, Feb 1, 2017 at 12:03 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
> >> On Wed, Feb 01, 2017 at 11:50:43AM +0100, Ladi Prosek wrote:
> >>> On Wed, Feb 1, 2017 at 11:20 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
> >>>> On Wed, Feb 01, 2017 at 11:06:46AM +0100, Ladi Prosek wrote:
> >>>>> Analogous to guest-file-read and guest-file-write, this commit adds
> >>>>> support for issuing IOCTLs to files in the guest. With the goal of
> >>>>> abstracting away the differences between Posix ioctl() and Win32
> >>>>> DeviceIoControl() to provide one unified API, the schema distinguishes
> >>>>> between input and output buffer sizes (as required by Win32) and
> >>>>> allows the caller to supply either a 'buffer', pointer to which will
> >>>>> be passed to the Posix ioctl(), or an integer argument, passed to
> >>>>> ioctl() directly.
> >>>> What is the intended usage scenario for this ?
> >>> My specific case is extracting a piece of data from Windows guests.
> >>> Guest driver exposes a file object with a well-known IOCTL code to
> >>> return a data structure from the kernel.
> >> Please provide more information about what you're trying to do.
> >>
> >> If we can understand the full details it might suggest a different
> >> approach that exposing a generic ioctl passthrough facility.
> > The end goal is to be able to create a Window crash dump file from a
> > running (or crashed, but running is more interesting because Windows
> > can't do that by itself) Windows VM. To do that without resorting to
> > hacks, the host application driving this needs to get the crash dump
> > header, which Windows provides in its KeInitializeCrashDumpHeader
> > kernel API.
> >
> > I believe that the most natural way to do this is to have
> > 1) a driver installed in the guest providing a way to call
> > KeInitializeCrashDumpHeader from user space
> > 2) an agent in the guest, running in user space, calling the driver
> > and passing the result back to the host
> >
> > Now 2) may as well be an existing agent, such as the QEMU guest agent,
> > and that's why I am here :)
> >
> > KeInitializeCrashDumpHeader returns an opaque byte array, happens to
> > be 8192 bytes at the moment. My first choice for the kernel-user
> > interface in the guest is IOCTL because what I'm trying to get across
> > is a block, a "datagram", not a stream, and I have the option for
> > easily adding more functionality later by adding more IOCTL codes with
> > the file object still representing "the driver".
> >
> > I could use regular file I/O as well. I would either have to devise a
> > protocol for talking to the driver, have a way of delimiting messages,
> > re-syncing the channel etc., or make a slight semantic shift and
> > instead of the file object representing the driver, it would represent
> > this one particular function of the driver. Opening the file and
> > reading from it until eof would yield the crash dump header.
> I think that this is not as good as can be for the whole design of the
> feature.
> The problem here is that userspace starts toooooooooo late and is not
> accessible when the guest BSODS and we do need a dump for analysis.
> 
> May be it worth to push this header to QEMU at boot time through virtio bus?

Yes, the lateness of userspace startup is the same objection I have to
the use of the guest agent in a similar role for dumping info for Linux
guests with KASLR.

  http://lists.nongnu.org/archive/html/qemu-devel/2016-11/msg01618.html

That thread explored options like virtio-pstore, UEFI pstore, or ACPI
pstore as approaches for pushing the info to the host during very early
guest start.

Definitely feels like there's scope for considering these needs in
a common framework. From a libvirt POV we would really very strongly
prefer to have a guest dump mechanism that has a common architectural
approach regardless of guest OS, even if the actual info sent over
that architecture was different.

Regards,
Daniel
Ladi Prosek Feb. 6, 2017, 3:50 p.m. UTC | #11
On Mon, Feb 6, 2017 at 4:37 PM, Denis V. Lunev <den@openvz.org> wrote:
> On 02/01/2017 04:41 PM, Ladi Prosek wrote:
>> On Wed, Feb 1, 2017 at 12:03 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
>>> On Wed, Feb 01, 2017 at 11:50:43AM +0100, Ladi Prosek wrote:
>>>> On Wed, Feb 1, 2017 at 11:20 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
>>>>> On Wed, Feb 01, 2017 at 11:06:46AM +0100, Ladi Prosek wrote:
>>>>>> Analogous to guest-file-read and guest-file-write, this commit adds
>>>>>> support for issuing IOCTLs to files in the guest. With the goal of
>>>>>> abstracting away the differences between Posix ioctl() and Win32
>>>>>> DeviceIoControl() to provide one unified API, the schema distinguishes
>>>>>> between input and output buffer sizes (as required by Win32) and
>>>>>> allows the caller to supply either a 'buffer', pointer to which will
>>>>>> be passed to the Posix ioctl(), or an integer argument, passed to
>>>>>> ioctl() directly.
>>>>> What is the intended usage scenario for this ?
>>>> My specific case is extracting a piece of data from Windows guests.
>>>> Guest driver exposes a file object with a well-known IOCTL code to
>>>> return a data structure from the kernel.
>>> Please provide more information about what you're trying to do.
>>>
>>> If we can understand the full details it might suggest a different
>>> approach that exposing a generic ioctl passthrough facility.
>> The end goal is to be able to create a Window crash dump file from a
>> running (or crashed, but running is more interesting because Windows
>> can't do that by itself) Windows VM. To do that without resorting to
>> hacks, the host application driving this needs to get the crash dump
>> header, which Windows provides in its KeInitializeCrashDumpHeader
>> kernel API.
>>
>> I believe that the most natural way to do this is to have
>> 1) a driver installed in the guest providing a way to call
>> KeInitializeCrashDumpHeader from user space
>> 2) an agent in the guest, running in user space, calling the driver
>> and passing the result back to the host
>>
>> Now 2) may as well be an existing agent, such as the QEMU guest agent,
>> and that's why I am here :)
>>
>> KeInitializeCrashDumpHeader returns an opaque byte array, happens to
>> be 8192 bytes at the moment. My first choice for the kernel-user
>> interface in the guest is IOCTL because what I'm trying to get across
>> is a block, a "datagram", not a stream, and I have the option for
>> easily adding more functionality later by adding more IOCTL codes with
>> the file object still representing "the driver".
>>
>> I could use regular file I/O as well. I would either have to devise a
>> protocol for talking to the driver, have a way of delimiting messages,
>> re-syncing the channel etc., or make a slight semantic shift and
>> instead of the file object representing the driver, it would represent
>> this one particular function of the driver. Opening the file and
>> reading from it until eof would yield the crash dump header.
> I think that this is not as good as can be for the whole design of the
> feature.
> The problem here is that userspace starts toooooooooo late and is not
> accessible when the guest BSODS and we do need a dump for analysis.
>
> May be it worth to push this header to QEMU at boot time through virtio bus?

Yes, definitely an option. I believe that having the ability to create
a dump out of a live system, i.e. without crashing it, is what adds
the most value here. And that would most likely happen on an
up-and-running guest so the difference between being able to do it
after a kernel driver loads and after a user space service starts is
not that significant.

Still, the sooner the better, for sure. I think I've seen
virtio-pstore suggested as a possible channel to use to push the
header.

Thanks!

> Den
Alexey Kostyushko Feb. 6, 2017, 4:10 p.m. UTC | #12
This info is more about the pvpanic patch, but may be relevant here also.

Consider that KeInitializeCrashDumpHeader is not enough to create dumps
because it copies only the primary header and doesn't copy
KdDebuggerBlock. Look into combining info from
KeCapturePersistentThreadState.

Also, whenever the amount of RAM is changed (memory hot-plug) you need
to recreate the headers.

I think there should be a virtio direct channel for the general purpose
of dumping, and the guest-agent channel may be useful for obtaining
various info from guest drivers through ioctls and providing a
live-dump feature.
Ladi Prosek Feb. 7, 2017, 1:16 p.m. UTC | #13
On Mon, Feb 6, 2017 at 5:10 PM, Alexey Kostyushko <aleksko@virtuozzo.com> wrote:
> This info is more about pvpanic patch, but may be relevant here also.
>
> Consider that KeInitializeCrashDumpHeader is not enough to create dumps
> because it copy only primary header

You mean that the secondary data area is missing? That's by design -
again, the OS is running, we're not really crashing it so
BugCheckSecondaryDumpDataCallback's are not invoked.

For the actual "crash" crash dumps, we can experiment with
BugCheckDumpIoCallback which would let us push out the header as well
as secondary data but that's a non-goal at the moment.

> and doesn't copy KdDebuggerBlock.

It contains the pointers from KdDebuggerBlock: PsLoadedModuleList,
PsActiveProcessHead, and more. In my testing, the dump created from
KeInitializeCrashDumpHeader and the raw memory image was fully
functional. If there's anything specific you're concerned about, I can
check it.

> Look to combine info from KeCapturePersistentThreadState.

This looks like an undocumented API and I'd like to avoid those if possible.

> Also whenever amount of RAM is changed (memory hot-plug) you need to
> recreate headers.

That would be a maybe. We could just patch the memory regions in the
header to reflect the current memory layout. And there's a chance we
may need to do it anyways - the physical memory layout that QEMU works
with doesn't match what Windows sees for some reason. That's one thing
I need to take a close look at. Thanks!

> I think that there should be virtio direct channel for general purposes of
> dumping and guest-agent channel may be usefull for obtainin varoius
>
> info from guest drivers through ioctls and providing live-dump feature.

Patch

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index ea37c09..4fb6edc 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -547,6 +547,84 @@  GuestFileWrite *qmp_guest_file_write(int64_t handle, const char *buf_b64,
     return write_data;
 }
 
+GuestFileIOCTL *qmp_guest_file_ioctl(int64_t handle, int64_t code,
+                                     int64_t out_count, const char *in_buf_b64,
+                                     bool has_in_count, int64_t in_count,
+                                     bool has_int_arg, int64_t int_arg,
+                                     Error **errp)
+{
+    GuestFileIOCTL *ioctl_data = NULL;
+    guchar *buf;
+    gsize buf_len;
+    int ioctl_retval;
+    GuestFileHandle *gfh = guest_file_handle_find(handle, errp);
+    int fd;
+
+    if (!gfh) {
+        return NULL;
+    }
+    fd = fileno(gfh->fh);
+    if (fd < 0) {
+        error_setg_errno(errp, errno, "failed to get fd");
+        return NULL;
+    }
+
+    buf = qbase64_decode(in_buf_b64, -1, &buf_len, errp);
+    if (!buf) {
+        return NULL;
+    }
+    if (has_int_arg && buf_len) {
+        error_setg(errp,
+            "int-arg and non-empty in-buf-b64 must not be specified at the same time");
+        g_free(buf);
+        return NULL;
+    }
+    if (has_int_arg && out_count) {
+        error_setg(errp,
+            "int-arg and non-zero out-count must not be specified at the same time");
+        g_free(buf);
+        return NULL;
+    }
+
+    if (!has_in_count) {
+        in_count = buf_len;
+    } else if (in_count < 0 || in_count > buf_len) {
+        error_setg(errp, "value '%" PRId64 "' is invalid for argument in-count",
+                   in_count);
+        g_free(buf);
+        return NULL;
+    }
+
+    /* there's only one input/output buffer so make sure it's large enough */
+    if (out_count > buf_len) {
+        guchar *old_buf = buf;
+        buf = g_malloc(out_count);
+
+        memcpy(buf, old_buf, buf_len);
+        memset(buf + buf_len, 0, out_count - buf_len);
+        g_free(old_buf);
+    }
+
+    if (has_int_arg) {
+        ioctl_retval = ioctl(fd, code, int_arg);
+    } else {
+        ioctl_retval = ioctl(fd, code, buf);
+    }
+
+    if (ioctl_retval < 0) {
+        error_setg_errno(errp, errno, "failed to issue IOCTL to file");
+        slog("guest-file-ioctl failed, handle: %" PRId64, handle);
+    } else {
+        ioctl_data = g_new0(GuestFileIOCTL, 1);
+        ioctl_data->retcode = ioctl_retval;
+        ioctl_data->count = out_count;
+        ioctl_data->buf_b64 = g_base64_encode(buf, out_count);
+    }
+    g_free(buf);
+
+    return ioctl_data;
+}
+
 struct GuestFileSeek *qmp_guest_file_seek(int64_t handle, int64_t offset,
                                           GuestFileWhence *whence_code,
                                           Error **errp)
diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 19d72b2..7d1ad35 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -381,6 +381,72 @@  done:
     return write_data;
 }
 
+GuestFileIOCTL *qmp_guest_file_ioctl(int64_t handle, int64_t code,
+                                     int64_t out_count, const char *in_buf_b64,
+                                     bool has_in_count, int64_t in_count,
+                                     bool has_int_arg, int64_t int_arg,
+                                     Error **errp)
+{
+    GuestFileIOCTL *ioctl_data = NULL;
+    guchar *in_buf = NULL, *out_buf = NULL;
+    gsize in_buf_len;
+    BOOL is_ok;
+    GuestFileHandle *gfh = guest_file_handle_find(handle, errp);
+    DWORD bytes_returned;
+    HANDLE fh;
+
+    if (!gfh) {
+        return NULL;
+    }
+    fh = gfh->fh;
+    in_buf = qbase64_decode(in_buf_b64, -1, &in_buf_len, errp);
+    if (!in_buf) {
+        return NULL;
+    }
+    if (has_int_arg) {
+        error_setg(errp, "integer arguments are not supported");
+        goto done;
+    }
+
+    if (!has_in_count) {
+        in_count = in_buf_len;
+    } else if (in_count < 0 || in_count > in_buf_len) {
+        error_setg(errp, "value '%" PRId64
+                   "' is invalid for argument in-count", in_count);
+        goto done;
+    }
+
+    if (out_count < 0) {
+        error_setg(errp, "value '%" PRId64
+                   "' is invalid for argument out-count", out_count);
+        goto done;
+    }
+    if (out_count > 0) {
+        out_buf = g_malloc(out_count);
+    }
+
+    is_ok = DeviceIoControl(fh, code, in_buf, in_count, out_buf, out_count,
+                            &bytes_returned, NULL);
+    if (!is_ok) {
+        error_setg_win32(errp, GetLastError(), "failed to issue IOCTL to file");
+        slog("guest-file-ioctl-failed, handle: %" PRId64, handle);
+    } else {
+        ioctl_data = g_new0(GuestFileIOCTL, 1);
+        ioctl_data->retcode = is_ok;
+        ioctl_data->count = bytes_returned;
+        if (out_buf) {
+            ioctl_data->buf_b64 = g_base64_encode(out_buf, bytes_returned);
+        } else {
+            ioctl_data->buf_b64 = g_strdup("");
+        }
+    }
+
+done:
+    g_free(in_buf);
+    g_free(out_buf);
+    return ioctl_data;
+}
+
 GuestFileSeek *qmp_guest_file_seek(int64_t handle, int64_t offset,
                                    GuestFileWhence *whence_code,
                                    Error **errp)
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index d421609..efef6d9 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -298,6 +298,51 @@ 
   'data':    { 'handle': 'int', 'buf-b64': 'str', '*count': 'int' },
   'returns': 'GuestFileWrite' }
 
+##
+# @GuestFileIOCTL:
+#
+# Result of guest agent file-ioctl operation
+#
+# @retcode: return value of the IOCTL call
+#
+# @count: number of output bytes (note: count is *before*
+#         base64-encoding is applied)
+#
+# @buf-b64: base64-encoded output bytes
+#
+# Since: 2.9
+##
+{ 'struct': 'GuestFileIOCTL',
+  'data': { 'retcode': 'int', 'count': 'int', 'buf-b64': 'str' } }
+
+##
+# @guest-file-ioctl:
+#
+# Issue an IOCTL to an open file in the guest.
+#
+# @handle: filehandle returned by guest-file-open
+#
+# @code: device-dependent request code
+#
+# @out-count: output bytes expected to be returned by the IOCTL
+#
+# @in-buf-b64: base64-encoded string representing input data
+#
+# @in-count: #optional input bytes (actual bytes, after base64-decode),
+#            default is all content in in-buf-b64 buffer after base64 decoding
+#
+# @int-arg: #optional integer input argument to be used instead of buf-b64,
+#           it is an error to pass both a non-empty in-buf-b64 and int-arg
+#           and to pass both a non-zero out-count and int-arg
+#
+# Returns: @GuestFileIOCTL on success.
+#
+# Since: 2.9
+##
+{ 'command': 'guest-file-ioctl',
+  'data':    { 'handle': 'int', 'code': 'int', 'out-count': 'int',
+               'in-buf-b64': 'str', '*in-count': 'int', '*int-arg': 'int' },
+  'returns': 'GuestFileIOCTL' }
 
 ##
 # @GuestFileSeek: