diff mbox series

hw/9pfs: virtio-9p: Ensure config space is a multiple of 4 bytes

Message ID 1603959941-9689-1-git-send-email-bmeng.cn@gmail.com (mailing list archive)
State New, archived
Series hw/9pfs: virtio-9p: Ensure config space is a multiple of 4 bytes

Commit Message

Bin Meng Oct. 29, 2020, 8:25 a.m. UTC
From: Bin Meng <bin.meng@windriver.com>

At present the virtio device config space access is handled by the
virtio_config_readX() and virtio_config_writeX() APIs. They perform
a sanity check on the result of address plus size against the config
space size before the access occurs.

For unaligned access, the last converted naturally aligned access
will fail the sanity check on 9pfs. For example, with a mount_tag
`p9fs`, if guest software tries to read the mount_tag via a 4 byte
read at the mount_tag offset which is not 4 byte aligned, the read
result will be `p9\377\377`, which is wrong.

This changes the size of device config space to be a multiple of 4
bytes so that correct result can be returned in all circumstances.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
---

 hw/9pfs/virtio-9p-device.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
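
The size arithmetic behind the change can be sketched as follows. This is a minimal illustration, not QEMU code: `round_up_4()`, `model_9p_config_size()` and `MODEL_9P_CONFIG_HDR` are hypothetical names, and the 2-byte header size is taken from the packed `tag_len` field shown later in the thread.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: the fixed part of the 9p config is the 16-bit tag_len field. */
#define MODEL_9P_CONFIG_HDR 2

/* Hypothetical helper (not QEMU's ROUND_UP): next multiple of 4 at or above n. */
static size_t round_up_4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Unpadded config size for a mount tag: header plus the tag bytes, no NUL. */
static size_t model_9p_config_size(const char *tag)
{
    return MODEL_9P_CONFIG_HDR + strlen(tag);
}
```

With the tag `p9fs`, `model_9p_config_size("p9fs")` gives 6, and `round_up_4(6)` gives 8, which is the size the patch would pass to virtio_init().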

Comments

Christian Schoenebeck Oct. 29, 2020, 12:52 p.m. UTC | #1
On Thursday, 29 October 2020 09:25:41 CET Bin Meng wrote:
> From: Bin Meng <bin.meng@windriver.com>
> 
> At present the virtio device config space access is handled by the
> virtio_config_readX() and virtio_config_writeX() APIs. They perform
> a sanity check on the result of address plus size against the config
> space size before the access occurs.

Since I am not very familiar with the virtio implementation side, I hope
Michael would have a look at this patch.

But some comments from my side ...

> 
> For unaligned access, the last converted naturally aligned access
> will fail the sanity check on 9pfs. For example, with a mount_tag
> `p9fs`, if guest software tries to read the mount_tag via a 4 byte
> read at the mount_tag offset which is not 4 byte aligned, the read
> result will be `p9\377\377`, which is wrong.

Why 4? Shouldn't this rather consider worst case alignment?

> 
> This changes the size of device config space to be a multiple of 4
> bytes so that correct result can be returned in all circumstances.
> 
> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> ---
> 
>  hw/9pfs/virtio-9p-device.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index 14371a7..e6a1432 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -201,6 +201,7 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
>      V9fsVirtioState *v = VIRTIO_9P(dev);
>      V9fsState *s = &v->state;
>      FsDriverEntry *fse = get_fsdev_fsentry(s->fsconf.fsdev_id);
> +    size_t config_size;
>  
>      if (qtest_enabled() && fse) {
>          fse->export_flags |= V9FS_NO_PERF_WARN;
> @@ -211,7 +212,8 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
>      }
>  
>      v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
> -    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
> +    config_size = ROUND_UP(v->config_size, 4);
> +    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, config_size);
>      v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
>  }

Shouldn't this config_size correction rather be handled on virtio.c side
instead, i.e. in virtio_init()?

>  
> -- 
> 2.7.4

Best regards,
Christian Schoenebeck
Bin Meng Oct. 29, 2020, 1:19 p.m. UTC | #2
Hi Christian,

On Thu, Oct 29, 2020 at 8:52 PM Christian Schoenebeck
<qemu_oss@crudebyte.com> wrote:
>
> On Thursday, 29 October 2020 09:25:41 CET Bin Meng wrote:
> > From: Bin Meng <bin.meng@windriver.com>
> >
> > At present the virtio device config space access is handled by the
> > virtio_config_readX() and virtio_config_writeX() APIs. They perform
> > a sanity check on the result of address plus size against the config
> > space size before the access occurs.
>
> Since I am not very familiar with the virtio implementation side, I hope
> Michael would have a look at this patch.
>
> But some comments from my side ...

Thanks for the review.

>
> >
> > For unaligned access, the last converted naturally aligned access
> > will fail the sanity check on 9pfs. For example, with a mount_tag
> > `p9fs`, if guest software tries to read the mount_tag via a 4 byte
> > read at the mount_tag offset which is not 4 byte aligned, the read
> > result will be `p9\377\377`, which is wrong.
>
> Why 4? Shouldn't this rather consider worst case alignment?
>

Both the PCI and MMIO transports support only 1-, 2- and 4-byte accesses
to the config space, hence the worst-case alignment is 4.
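
The claim above can be checked with a small model. This is only a sketch under stated assumptions: the bounds check is modelled as `addr + width <= config_len`, and `access_fits()` and `all_aligned_accesses_fit()` are hypothetical names, not QEMU functions.

```c
#include <stddef.h>

/* Assumed form of the sanity check: the whole access must lie inside the space. */
static int access_fits(size_t config_len, size_t addr, size_t width)
{
    return addr + width <= config_len;
}

/* Try every naturally aligned 1-, 2- and 4-byte access that starts inside
 * the config space and report whether all of them pass the check. */
static int all_aligned_accesses_fit(size_t config_len)
{
    static const size_t widths[] = { 1, 2, 4 };

    for (size_t w = 0; w < sizeof(widths) / sizeof(widths[0]); w++) {
        for (size_t addr = 0; addr < config_len; addr += widths[w]) {
            if (!access_fits(config_len, addr, widths[w])) {
                return 0;
            }
        }
    }
    return 1;
}
```

For a 6-byte config space the aligned 4-byte access at offset 4 is rejected, so `all_aligned_accesses_fit(6)` is 0, while `all_aligned_accesses_fit(8)` is 1.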

> >
> > This changes the size of device config space to be a multiple of 4
> > bytes so that correct result can be returned in all circumstances.
> >
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > ---
> >
> >  hw/9pfs/virtio-9p-device.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> > index 14371a7..e6a1432 100644
> > --- a/hw/9pfs/virtio-9p-device.c
> > +++ b/hw/9pfs/virtio-9p-device.c
> > @@ -201,6 +201,7 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
> >      V9fsVirtioState *v = VIRTIO_9P(dev);
> >      V9fsState *s = &v->state;
> >      FsDriverEntry *fse = get_fsdev_fsentry(s->fsconf.fsdev_id);
> > +    size_t config_size;
> >
> >      if (qtest_enabled() && fse) {
> >          fse->export_flags |= V9FS_NO_PERF_WARN;
> > @@ -211,7 +212,8 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
> >      }
> >
> >      v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
> > -    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
> > +    config_size = ROUND_UP(v->config_size, 4);
> > +    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, config_size);
> >      v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
> >  }
>
> Shouldn't this config_size correction rather be handled on virtio.c side
> instead, i.e. in virtio_init()?

I checked the other existing virtio devices, and their config space sizes
already seem to be multiples of 4 bytes. Fixing it in virtio_init()
sounds more future-proof. Michael?

Regards,
Bin
Michael S. Tsirkin Oct. 30, 2020, 9:29 a.m. UTC | #3
On Thu, Oct 29, 2020 at 04:25:41PM +0800, Bin Meng wrote:
> From: Bin Meng <bin.meng@windriver.com>
> 
> At present the virtio device config space access is handled by the
> virtio_config_readX() and virtio_config_writeX() APIs. They perform
> a sanity check on the result of address plus size against the config
> space size before the access occurs.
> 
> For unaligned access, the last converted naturally aligned access
> will fail the sanity check on 9pfs. For example, with a mount_tag
> `p9fs`, if guest software tries to read the mount_tag via a 4 byte
> read at the mount_tag offset which is not 4 byte aligned, the read
> result will be `p9\377\377`, which is wrong.
> 
> This changes the size of device config space to be a multiple of 4
> bytes so that correct result can be returned in all circumstances.
> 
> Signed-off-by: Bin Meng <bin.meng@windriver.com>



The patch is ok, but I'd like to clarify the commit log.

If I understand correctly, what happens is:
- tag is set to a value that is not a multiple of 4 bytes
- guest attempts to read the last 4 bytes of the tag
- access returns -1


What I find confusing in the above description:
- reference to unaligned access - I don't think these
  are legal or allowed by QEMU
- reference to `p9\377\377` - I think returned value will be -1

thanks!

> ---
> 
>  hw/9pfs/virtio-9p-device.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
> index 14371a7..e6a1432 100644
> --- a/hw/9pfs/virtio-9p-device.c
> +++ b/hw/9pfs/virtio-9p-device.c
> @@ -201,6 +201,7 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
>      V9fsVirtioState *v = VIRTIO_9P(dev);
>      V9fsState *s = &v->state;
>      FsDriverEntry *fse = get_fsdev_fsentry(s->fsconf.fsdev_id);
> +    size_t config_size;
>  
>      if (qtest_enabled() && fse) {
>          fse->export_flags |= V9FS_NO_PERF_WARN;
> @@ -211,7 +212,8 @@ static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
>      }
>  
>      v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
> -    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
> +    config_size = ROUND_UP(v->config_size, 4);
> +    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, config_size);
>      v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
>  }
>  
> -- 
> 2.7.4
Bin Meng Nov. 3, 2020, 6:26 a.m. UTC | #4
Hi Michael,

On Fri, Oct 30, 2020 at 5:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Oct 29, 2020 at 04:25:41PM +0800, Bin Meng wrote:
> > From: Bin Meng <bin.meng@windriver.com>
> >
> > At present the virtio device config space access is handled by the
> > virtio_config_readX() and virtio_config_writeX() APIs. They perform
> > a sanity check on the result of address plus size against the config
> > space size before the access occurs.
> >
> > For unaligned access, the last converted naturally aligned access
> > will fail the sanity check on 9pfs. For example, with a mount_tag
> > `p9fs`, if guest software tries to read the mount_tag via a 4 byte
> > read at the mount_tag offset which is not 4 byte aligned, the read
> > result will be `p9\377\377`, which is wrong.
> >
> > This changes the size of device config space to be a multiple of 4
> > bytes so that correct result can be returned in all circumstances.
> >
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
>
>
>
> The patch is ok, but I'd like to clarify the commit log.

Thanks for the review.

>
> If I understand correctly, what happens is:
> - tag is set to a value that is not a multiple of 4 bytes

It's not about the mount_tag value; in this case the length of the mount_tag is 4.

> - guest attempts to read the last 4 bytes of the tag

Yep. So the config space of a 9pfs looks like the following:

offset: 0x14, size: 2 bytes indicating the length of the following mount_tag
offset: 0x16, size: value of (offset 0x14).

When a 4-byte mount_tag is given, guest software is expected to read 4
bytes (the value read from offset 0x14) at offset 0x16.

> - access returns -1
>

The access will be split into 2 accesses, either by hardware or
software. On RISC-V such an unaligned access is emulated by M-mode
firmware. On ARM I believe it's supported by the CPU. So the first
converted aligned access reads 4 bytes at 0x14 and the second
converted aligned access reads 4 bytes at 0x16; the bytes that are not
needed are dropped, and the remaining bytes are assembled and returned
to the guest software. The second aligned access will fail the
sanity check and return -1, but the first access will not, hence the
result will be `p9\377\377`.

>
> What I find confusing in the above description:
> - reference to unaligned access - I don't think these
>   are legal or allowed by QEMU
> - reference to `p9\377\377` - I think returned value will be -1
>

Regards,
Bin
Bin Meng Nov. 3, 2020, 6:30 a.m. UTC | #5
On Tue, Nov 3, 2020 at 2:26 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> Hi Michael,
>
> On Fri, Oct 30, 2020 at 5:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Oct 29, 2020 at 04:25:41PM +0800, Bin Meng wrote:
> > > From: Bin Meng <bin.meng@windriver.com>
> > >
> > > At present the virtio device config space access is handled by the
> > > virtio_config_readX() and virtio_config_writeX() APIs. They perform
> > > a sanity check on the result of address plus size against the config
> > > space size before the access occurs.
> > >
> > > For unaligned access, the last converted naturally aligned access
> > > will fail the sanity check on 9pfs. For example, with a mount_tag
> > > `p9fs`, if guest software tries to read the mount_tag via a 4 byte
> > > read at the mount_tag offset which is not 4 byte aligned, the read
> > > result will be `p9\377\377`, which is wrong.
> > >
> > > This changes the size of device config space to be a multiple of 4
> > > bytes so that correct result can be returned in all circumstances.
> > >
> > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >
> >
> >
> > The patch is ok, but I'd like to clarify the commit log.
>
> Thanks for the review.
>
> >
> > If I understand correctly, what happens is:
> > - tag is set to a value that is not a multiple of 4 bytes
>
> It's not about the mount_tag value, but the length of the mount_tag is 4.
>
> > - guest attempts to read the last 4 bytes of the tag
>
> Yep. So the config space of a 9pfs looks like the following:
>
> offset: 0x14, size: 2 bytes indicating the length of the following mount_tag
> offset: 0x16, size: value of (offset 0x14).
>
> When a 4-byte mount_tag is given, guest software is subject to read 4
> bytes (value read from offset 0x14) at offset 0x16.
>
> > - access returns -1
> >
>
> The access will be split into 2 accesses, either by hardware or
> software. On RISC-V such unaligned access is emulated by M-mode
> firmware. On ARM I believe it's supported by the CPU. So the first
> converted aligned access is to read 4 byte at 0x14 and the second
> converted aligned access is to read 4 byte at 0x16, and drop the bytes

Oops, typo. The 2nd access happens at offset 0x18

> that are not needed, assemble the remaining bytes and return the
> result to the guest software. The second aligned access will fail the
> sanity check and return -1, but not the first access, hence the result
> will be `p9\377\377`.
>
> >
> > What I find confusing in the above description:
> > - reference to unaligned access - I don't think these
> >   are legal or allowed by QEMU
> > - reference to `p9\377\377` - I think returned value will be -1
> >

Regards,
Bin
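
The corrected sequence (aligned reads at 0x14 and 0x18) can be modelled with a short sketch. Everything here is a hypothetical model rather than QEMU or firmware code: offsets are relative to the start of the device config space (0x14 in the thread), the tag length is assumed to be stored little endian, and a failed bounds check is modelled as four 0xff bytes, i.e. -1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a device config space; not QEMU's data structure. */
struct config_model {
    uint8_t bytes[8];   /* backing storage, zero padded */
    size_t len;         /* size used by the bounds check */
};

/* Fill the model with tag_len = 4 followed by the tag "p9fs". */
static void model_init(struct config_model *c, size_t len)
{
    static const uint8_t image[8] = { 4, 0, 'p', '9', 'f', 's', 0, 0 };

    memcpy(c->bytes, image, sizeof(image));
    c->len = len;
}

/* Aligned 4-byte read: returns all 0xff when addr + 4 exceeds len. */
static void model_read4(const struct config_model *c, size_t addr,
                        uint8_t out[4])
{
    if (addr + 4 > c->len) {
        memset(out, 0xff, 4);
        return;
    }
    memcpy(out, c->bytes + addr, 4);
}

/* Unaligned 4-byte read emulated as two aligned reads whose unneeded
 * bytes are dropped and whose remaining bytes are reassembled. */
static void model_read4_unaligned(const struct config_model *c, size_t addr,
                                  uint8_t out[4])
{
    size_t base = addr & ~(size_t)3;
    uint8_t both[8];

    model_read4(c, base, both);
    model_read4(c, base + 4, both + 4);
    memcpy(out, both + (addr - base), 4);
}
```

With `len` set to 6, reading 4 bytes at config offset 2 (0x16) yields `p9\377\377`; with `len` rounded up to 8, the same read yields `p9fs`.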
Michael S. Tsirkin Nov. 3, 2020, 12:05 p.m. UTC | #6
On Tue, Nov 03, 2020 at 02:26:10PM +0800, Bin Meng wrote:
> Hi Michael,
> 
> On Fri, Oct 30, 2020 at 5:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Oct 29, 2020 at 04:25:41PM +0800, Bin Meng wrote:
> > > From: Bin Meng <bin.meng@windriver.com>
> > >
> > > At present the virtio device config space access is handled by the
> > > virtio_config_readX() and virtio_config_writeX() APIs. They perform
> > > a sanity check on the result of address plus size against the config
> > > space size before the access occurs.
> > >
> > > For unaligned access, the last converted naturally aligned access
> > > will fail the sanity check on 9pfs. For example, with a mount_tag
> > > `p9fs`, if guest software tries to read the mount_tag via a 4 byte
> > > read at the mount_tag offset which is not 4 byte aligned, the read
> > > result will be `p9\377\377`, which is wrong.
> > >
> > > This changes the size of device config space to be a multiple of 4
> > > bytes so that correct result can be returned in all circumstances.
> > >
> > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >
> >
> >
> > The patch is ok, but I'd like to clarify the commit log.
> 
> Thanks for the review.
> 
> >
> > If I understand correctly, what happens is:
> > - tag is set to a value that is not a multiple of 4 bytes
> 
> It's not about the mount_tag value, but the length of the mount_tag is 4.
> 
> > - guest attempts to read the last 4 bytes of the tag
> 
> Yep. So the config space of a 9pfs looks like the following:
> 
> offset: 0x14, size: 2 bytes indicating the length of the following mount_tag
> offset: 0x16, size: value of (offset 0x14).
> 
> When a 4-byte mount_tag is given, guest software is subject to read 4
> bytes (value read from offset 0x14) at offset 0x16.


Well looking at Linux guest code:


static inline void __virtio_cread_many(struct virtio_device *vdev,
                                       unsigned int offset,
                                       void *buf, size_t count, size_t bytes)
{
        u32 old, gen = vdev->config->generation ?
                vdev->config->generation(vdev) : 0;
        int i;  
                                   
        might_sleep();             
        do {
                old = gen;

                for (i = 0; i < count; i++)
                        vdev->config->get(vdev, offset + bytes * i,
                                          buf + i * bytes, bytes);
        
                gen = vdev->config->generation ?
                        vdev->config->generation(vdev) : 0;
        } while (gen != old);
}
        


static inline void virtio_cread_bytes(struct virtio_device *vdev,
                                      unsigned int offset,
                                      void *buf, size_t len) 
{           
        __virtio_cread_many(vdev, offset, buf, len, 1);
}

and:


        virtio_cread_bytes(vdev, offsetof(struct virtio_9p_config, tag),
                           tag, tag_len);



So the guest is doing multiple 1-byte reads.
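
For comparison, a byte-wise read never crosses the end of an unpadded config space. The following is a minimal sketch, not the Linux or QEMU implementation: `cfg_read1()` and `cfg_read_bytes()` are hypothetical names, and the bounds check is assumed to have the form `addr + size > len`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 1-byte config read with an addr + size bounds check. */
static uint8_t cfg_read1(const uint8_t *cfg, size_t cfg_len, size_t addr)
{
    if (addr + 1 > cfg_len) {
        return 0xff;    /* modelled failure value, i.e. (uint8_t)-1 */
    }
    return cfg[addr];
}

/* Copy count bytes starting at offset, using only 1-byte reads. */
static void cfg_read_bytes(const uint8_t *cfg, size_t cfg_len, size_t offset,
                           uint8_t *buf, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        buf[i] = cfg_read1(cfg, cfg_len, offset + i);
    }
}
```

Reading 4 bytes at offset 2 of a 6-byte config space holding `p9fs` returns the full tag, since every 1-byte access ends at or before offset 6.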


Spec actually says:
	For device configuration access, the driver MUST use 8-bit wide accesses for 8-bit wide fields, 16-bit wide
	and aligned accesses for 16-bit wide fields and 32-bit wide and aligned accesses for 32-bit and 64-bit wide
	fields. For 64-bit fields, the driver MAY access each of the high and low 32-bit parts of the field independently.

9p was never standardized, but the linux header at least lists it as
follows:

struct virtio_9p_config {
        /* length of the tag name */
        __virtio16 tag_len;
        /* non-NULL terminated tag name */
        __u8 tag[0];
} __attribute__((packed));

In that sense tag is an array of 8-bit fields.

So which guest reads tag using a 32 bit read, and why?



> > - access returns -1
> >
> 
> The access will be split into 2 accesses, either by hardware or
> software. On RISC-V such unaligned access is emulated by M-mode
> firmware. On ARM I believe it's supported by the CPU. So the first
> converted aligned access is to read 4 byte at 0x14 and the second
> converted aligned access is to read 4 byte at 0x16, and drop the bytes
> that are not needed, assemble the remaining bytes and return the
> result to the guest software. The second aligned access will fail the
> sanity check and return -1, but not the first access, hence the result
> will be `p9\377\377`.
> 
> >
> > What I find confusing in the above description:
> > - reference to unaligned access - I don't think these
> >   are legal or allowed by QEMU
> > - reference to `p9\377\377` - I think returned value will be -1
> >
> 
> Regards,
> Bin
Bin Meng Nov. 4, 2020, 7:44 a.m. UTC | #7
Hi Michael,

On Tue, Nov 3, 2020 at 8:05 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Nov 03, 2020 at 02:26:10PM +0800, Bin Meng wrote:
> > Hi Michael,
> >
> > On Fri, Oct 30, 2020 at 5:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Oct 29, 2020 at 04:25:41PM +0800, Bin Meng wrote:
> > > > From: Bin Meng <bin.meng@windriver.com>
> > > >
> > > > At present the virtio device config space access is handled by the
> > > > virtio_config_readX() and virtio_config_writeX() APIs. They perform
> > > > a sanity check on the result of address plus size against the config
> > > > space size before the access occurs.
> > > >
> > > > For unaligned access, the last converted naturally aligned access
> > > > will fail the sanity check on 9pfs. For example, with a mount_tag
> > > > `p9fs`, if guest software tries to read the mount_tag via a 4 byte
> > > > read at the mount_tag offset which is not 4 byte aligned, the read
> > > > result will be `p9\377\377`, which is wrong.
> > > >
> > > > This changes the size of device config space to be a multiple of 4
> > > > bytes so that correct result can be returned in all circumstances.
> > > >
> > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > >
> > >
> > >
> > > The patch is ok, but I'd like to clarify the commit log.
> >
> > Thanks for the review.
> >
> > >
> > > If I understand correctly, what happens is:
> > > - tag is set to a value that is not a multiple of 4 bytes
> >
> > It's not about the mount_tag value, but the length of the mount_tag is 4.
> >
> > > - guest attempts to read the last 4 bytes of the tag
> >
> > Yep. So the config space of a 9pfs looks like the following:
> >
> > offset: 0x14, size: 2 bytes indicating the length of the following mount_tag
> > offset: 0x16, size: value of (offset 0x14).
> >
> > When a 4-byte mount_tag is given, guest software is subject to read 4
> > bytes (value read from offset 0x14) at offset 0x16.
>
>
> Well looking at Linux guest code:
>
>
> static inline void __virtio_cread_many(struct virtio_device *vdev,
>                                        unsigned int offset,
>                                        void *buf, size_t count, size_t bytes)
> {
>         u32 old, gen = vdev->config->generation ?
>                 vdev->config->generation(vdev) : 0;
>         int i;
>
>         might_sleep();
>         do {
>                 old = gen;
>
>                 for (i = 0; i < count; i++)
>                         vdev->config->get(vdev, offset + bytes * i,
>                                           buf + i * bytes, bytes);
>
>                 gen = vdev->config->generation ?
>                         vdev->config->generation(vdev) : 0;
>         } while (gen != old);
> }
>
>
>
> static inline void virtio_cread_bytes(struct virtio_device *vdev,
>                                       unsigned int offset,
>                                       void *buf, size_t len)
> {
>         __virtio_cread_many(vdev, offset, buf, len, 1);
> }
>
> and:
>
>
>         virtio_cread_bytes(vdev, offsetof(struct virtio_9p_config, tag),
>                            tag, tag_len);
>
>
>
> So guest is doing multiple 1-byte reads.
>

Correct.

>
> Spec actually says:
>         For device configuration access, the driver MUST use 8-bit wide accesses for 8-bit wide fields, 16-bit wide
>
>         and aligned accesses for 16-bit wide fields and 32-bit wide and aligned accesses for 32-bit and 64-bit wide
>
>         fields. For 64-bit fields, the driver MAY access each of the high and low 32-bit parts of the field independently.
>

Yes.

> 9p was never standardized, but the linux header at least lists it as
> follows:
>
> struct virtio_9p_config {
>         /* length of the tag name */
>         __virtio16 tag_len;
>         /* non-NULL terminated tag name */
>         __u8 tag[0];
> } __attribute__((packed));
>
> In that sense tag is an array of 8-bit fields.
>
> So which guest reads tag using a 32 bit read, and why?
>

An obvious fix can be made in the guest that exposed this issue, but
I was wondering whether we should require every virtio device's config
space size to be a multiple of 4 bytes, which sounds more natural.

Regards,
Bin
Christian Schoenebeck Nov. 4, 2020, 10:57 a.m. UTC | #8
On Wednesday, 4 November 2020 08:44:44 CET Bin Meng wrote:
> Hi Michael,
> 
> On Tue, Nov 3, 2020 at 8:05 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Tue, Nov 03, 2020 at 02:26:10PM +0800, Bin Meng wrote:
> > > Hi Michael,
> > > 
> > > > On Fri, Oct 30, 2020 at 5:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > On Thu, Oct 29, 2020 at 04:25:41PM +0800, Bin Meng wrote:
> > > > > From: Bin Meng <bin.meng@windriver.com>
> > > > > 
> > > > > At present the virtio device config space access is handled by the
> > > > > virtio_config_readX() and virtio_config_writeX() APIs. They perform
> > > > > a sanity check on the result of address plus size against the config
> > > > > space size before the access occurs.
> > > > > 
> > > > > For unaligned access, the last converted naturally aligned access
> > > > > will fail the sanity check on 9pfs. For example, with a mount_tag
> > > > > `p9fs`, if guest software tries to read the mount_tag via a 4 byte
> > > > > read at the mount_tag offset which is not 4 byte aligned, the read
> > > > > result will be `p9\377\377`, which is wrong.
> > > > > 
> > > > > This changes the size of device config space to be a multiple of 4
> > > > > bytes so that correct result can be returned in all circumstances.
> > > > > 
> > > > > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> > > > 
> > > > The patch is ok, but I'd like to clarify the commit log.
> > > 
> > > Thanks for the review.
> > > 
> > > > If I understand correctly, what happens is:
> > > > - tag is set to a value that is not a multiple of 4 bytes
> > > 
> > > It's not about the mount_tag value, but the length of the mount_tag is
> > > 4.
> > > 
> > > > - guest attempts to read the last 4 bytes of the tag
> > > 
> > > Yep. So the config space of a 9pfs looks like the following:
> > > 
> > > > offset: 0x14, size: 2 bytes indicating the length of the following mount_tag
> > > > offset: 0x16, size: value of (offset 0x14).
> > > 
> > > When a 4-byte mount_tag is given, guest software is subject to read 4
> > > bytes (value read from offset 0x14) at offset 0x16.
> > 
> > Well looking at Linux guest code:
> > 
> > 
> > > static inline void __virtio_cread_many(struct virtio_device *vdev,
> > >                                        unsigned int offset,
> > >                                        void *buf, size_t count, size_t bytes)
> > > {
> > >         u32 old, gen = vdev->config->generation ?
> > >                 vdev->config->generation(vdev) : 0;
> > >         int i;
> > >
> > >         might_sleep();
> > >         do {
> > >                 old = gen;
> > >
> > >                 for (i = 0; i < count; i++)
> > >                         vdev->config->get(vdev, offset + bytes * i,
> > >                                           buf + i * bytes, bytes);
> > >
> > >                 gen = vdev->config->generation ?
> > >                         vdev->config->generation(vdev) : 0;
> > >         } while (gen != old);
> > > }
> > >
> > >
> > >
> > > static inline void virtio_cread_bytes(struct virtio_device *vdev,
> > >                                       unsigned int offset,
> > >                                       void *buf, size_t len)
> > > {
> > >         __virtio_cread_many(vdev, offset, buf, len, 1);
> > > }
> > >
> > > and:
> > >
> > >         virtio_cread_bytes(vdev, offsetof(struct virtio_9p_config, tag),
> > >                            tag, tag_len);
> > 
> > So guest is doing multiple 1-byte reads.
> 
> Correct.
> 
> > Spec actually says:
> > >         For device configuration access, the driver MUST use 8-bit wide
> > >         accesses for 8-bit wide fields, 16-bit wide and aligned accesses
> > >         for 16-bit wide fields and 32-bit wide and aligned accesses for
> > >         32-bit and 64-bit wide fields. For 64-bit fields, the driver MAY
> > >         access each of the high and low 32-bit parts of the field
> > >         independently.
> 
> Yes.
> 
> > 9p was never standardized, but the linux header at least lists it as
> > follows:
> > 
> > > struct virtio_9p_config {
> > >         /* length of the tag name */
> > >         __virtio16 tag_len;
> > >         /* non-NULL terminated tag name */
> > >         __u8 tag[0];
> > > } __attribute__((packed));
> > > 
> > > In that sense tag is an array of 8-bit fields.
> > 
> > So which guest reads tag using a 32 bit read, and why?
> 
> The obvious fix can be made to the guest which exposed this issue, but
> I was wondering whether we should enforce all virtio devices' config
> space size to be a multiple of 4 bytes which sounds more natural.
> 
> Regards,
> Bin

Personally I am not opposed to this being addressed in QEMU, but Michael
should decide that.

But even if it is addressed in QEMU, I still think it would be better
addressed on the virtio side, not on the virtio user side (9pfs, etc.),
because that's a virtio implementation detail that might change in the future.

What I definitely don't want to see, though, is this alignment issue being
handled with a hard-coded value on the user (9pfs) side, as this patch does
right now, because that smells like something that would be overlooked if
anything changes on the virtio side one day.
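
One way to picture the suggestion is a core-side init routine that owns the alignment constant, so device code passes its natural size. This is purely a hypothetical sketch: `model_device`, `model_device_init()` and `MODEL_CONFIG_ACCESS_MAX` are invented for illustration and do not describe virtio_init() or any existing QEMU interface.

```c
#include <stddef.h>

/* Hypothetical constant owned by the core: widest config access in bytes. */
#define MODEL_CONFIG_ACCESS_MAX 4

/* Hypothetical device state holding only the config space length. */
struct model_device {
    size_t config_len;
};

/* Hypothetical stand-in for a core init routine. The caller passes the
 * unpadded size and the core rounds it up, so device code carries no
 * hard-coded alignment value. */
static void model_device_init(struct model_device *dev, size_t config_size)
{
    size_t align = MODEL_CONFIG_ACCESS_MAX;

    dev->config_len = (config_size + align - 1) / align * align;
}
```

In this model a 9p device with the tag `p9fs` would pass 6 and end up with a config length of 8, and a later change to the access width would touch only the constant.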

Best regards,
Christian Schoenebeck

Patch

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 14371a7..e6a1432 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -201,6 +201,7 @@  static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
     V9fsVirtioState *v = VIRTIO_9P(dev);
     V9fsState *s = &v->state;
     FsDriverEntry *fse = get_fsdev_fsentry(s->fsconf.fsdev_id);
+    size_t config_size;
 
     if (qtest_enabled() && fse) {
         fse->export_flags |= V9FS_NO_PERF_WARN;
@@ -211,7 +212,8 @@  static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
     }
 
     v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
-    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
+    config_size = ROUND_UP(v->config_size, 4);
+    virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, config_size);
     v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
 }