diff mbox series

[V3,4/6] vDPA: !FEATURES_OK should not block querying device config space

Message ID 20220701132826.8132-5-lingshan.zhu@intel.com (mailing list archive)
State Not Applicable
Headers show
Series ifcvf/vDPA: support query device config space through netlink | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Zhu, Lingshan July 1, 2022, 1:28 p.m. UTC
Users may want to query the config space of a vDPA device,
to choose a appropriate one for a certain guest. This means the
users need to read the config space before FEATURES_OK, and
the existence of config space contents does not depend on
FEATURES_OK.

The spec says:
The device MUST allow reading of any device-specific configuration
field before FEATURES_OK is set by the driver. This includes
fields which are conditional on feature bits, as long as those
feature bits are offered by the device.

Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/vdpa.c | 8 --------
 1 file changed, 8 deletions(-)

Comments

Parav Pandit July 1, 2022, 10:12 p.m. UTC | #1
> From: Zhu Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 1, 2022 9:28 AM
> 
> Users may want to query the config space of a vDPA device, to choose a
> appropriate one for a certain guest. This means the users need to read the
> config space before FEATURES_OK, and the existence of config space
> contents does not depend on FEATURES_OK.
> 
> The spec says:
> The device MUST allow reading of any device-specific configuration field
> before FEATURES_OK is set by the driver. This includes fields which are
> conditional on feature bits, as long as those feature bits are offered by the
> device.
> 
> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
Fix is fine, but fixes tag needs correction described below.

Above commit id is 13 letters should be 12.
And 
It should be in format
Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")

Please use checkpatch.pl script before posting the patches to catch these errors.
There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.

> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/vdpa.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> 9b0e39b2f022..d76b22b2f7ae 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
> struct sk_buff *msg, u32 portid,  {
>  	u32 device_id;
>  	void *hdr;
> -	u8 status;
>  	int err;
> 
>  	down_read(&vdev->cf_lock);
> -	status = vdev->config->get_status(vdev);
> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
> completed");
> -		err = -EAGAIN;
> -		goto out;
> -	}
> -
>  	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>  			  VDPA_CMD_DEV_CONFIG_GET);
>  	if (!hdr) {
> --
> 2.31.1
Zhu, Lingshan July 8, 2022, 6:22 a.m. UTC | #2
On 7/2/2022 6:12 AM, Parav Pandit wrote:
>
>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 1, 2022 9:28 AM
>>
>> Users may want to query the config space of a vDPA device, to choose a
>> appropriate one for a certain guest. This means the users need to read the
>> config space before FEATURES_OK, and the existence of config space
>> contents does not depend on FEATURES_OK.
>>
>> The spec says:
>> The device MUST allow reading of any device-specific configuration field
>> before FEATURES_OK is set by the driver. This includes fields which are
>> conditional on feature bits, as long as those feature bits are offered by the
>> device.
>>
>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
> Fix is fine, but fixes tag needs correction described below.
>
> Above commit id is 13 letters should be 12.
> And
> It should be in format
> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
>
> Please use checkpatch.pl script before posting the patches to catch these errors.
> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.
strange, checkpatch.pl did not complain this, I will fix this tag. Thanks
>
>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>> ---
>>   drivers/vdpa/vdpa.c | 8 --------
>>   1 file changed, 8 deletions(-)
>>
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>> 9b0e39b2f022..d76b22b2f7ae 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>> struct sk_buff *msg, u32 portid,  {
>>   	u32 device_id;
>>   	void *hdr;
>> -	u8 status;
>>   	int err;
>>
>>   	down_read(&vdev->cf_lock);
>> -	status = vdev->config->get_status(vdev);
>> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>> completed");
>> -		err = -EAGAIN;
>> -		goto out;
>> -	}
>> -
>>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>   			  VDPA_CMD_DEV_CONFIG_GET);
>>   	if (!hdr) {
>> --
>> 2.31.1
Michael S. Tsirkin July 13, 2022, 5:23 a.m. UTC | #3
On Fri, Jul 01, 2022 at 10:12:49PM +0000, Parav Pandit wrote:
> 
> 
> > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > Sent: Friday, July 1, 2022 9:28 AM
> > 
> > Users may want to query the config space of a vDPA device, to choose a
> > appropriate one for a certain guest. This means the users need to read the
> > config space before FEATURES_OK, and the existence of config space
> > contents does not depend on FEATURES_OK.
> > 
> > The spec says:
> > The device MUST allow reading of any device-specific configuration field
> > before FEATURES_OK is set by the driver. This includes fields which are
c> > conditional on feature bits, as long as those feature bits are offered by the
> > device.
> > 
> > Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
> Fix is fine, but fixes tag needs correction described below.
> 
> Above commit id is 13 letters should be 12.
> And 
> It should be in format
> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")

Yea you normally use

--format='Fixes: %h (\"%s\")'


> Please use checkpatch.pl script before posting the patches to catch these errors.
> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.


I don't think checkpatch complains about this if for no other reason
that sometimes the 6 byte hash is not enough.

> > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > ---
> >  drivers/vdpa/vdpa.c | 8 --------
> >  1 file changed, 8 deletions(-)
> > 
> > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> > 9b0e39b2f022..d76b22b2f7ae 100644
> > --- a/drivers/vdpa/vdpa.c
> > +++ b/drivers/vdpa/vdpa.c
> > @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
> > struct sk_buff *msg, u32 portid,  {
> >  	u32 device_id;
> >  	void *hdr;
> > -	u8 status;
> >  	int err;
> > 
> >  	down_read(&vdev->cf_lock);
> > -	status = vdev->config->get_status(vdev);
> > -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> > -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
> > completed");
> > -		err = -EAGAIN;
> > -		goto out;
> > -	}
> > -
> >  	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> >  			  VDPA_CMD_DEV_CONFIG_GET);
> >  	if (!hdr) {
> > --
> > 2.31.1
Zhu, Lingshan July 13, 2022, 7:46 a.m. UTC | #4
On 7/13/2022 1:23 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 01, 2022 at 10:12:49PM +0000, Parav Pandit wrote:
>>
>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>> Sent: Friday, July 1, 2022 9:28 AM
>>>
>>> Users may want to query the config space of a vDPA device, to choose a
>>> appropriate one for a certain guest. This means the users need to read the
>>> config space before FEATURES_OK, and the existence of config space
>>> contents does not depend on FEATURES_OK.
>>>
>>> The spec says:
>>> The device MUST allow reading of any device-specific configuration field
>>> before FEATURES_OK is set by the driver. This includes fields which are
> c> > conditional on feature bits, as long as those feature bits are offered by the
>>> device.
yes
>>>
>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
>> Fix is fine, but fixes tag needs correction described below.
>>
>> Above commit id is 13 letters should be 12.
>> And
>> It should be in format
>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
> Yea you normally use
>
> --format='Fixes: %h (\"%s\")'
Thanks, but I will drop this fix tag, since Parav suggest I drop the fix 
tag of the 3/6 patch which reporting device
feature bits to the upserspace(this fix is composed of several patches).
>
>
>> Please use checkpatch.pl script before posting the patches to catch these errors.
>> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.
>
> I don't think checkpatch complains about this if for no other reason
> that sometimes the 6 byte hash is not enough.
>
>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
>>> ---
>>>   drivers/vdpa/vdpa.c | 8 --------
>>>   1 file changed, 8 deletions(-)
>>>
>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>> --- a/drivers/vdpa/vdpa.c
>>> +++ b/drivers/vdpa/vdpa.c
>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>> struct sk_buff *msg, u32 portid,  {
>>>   	u32 device_id;
>>>   	void *hdr;
>>> -	u8 status;
>>>   	int err;
>>>
>>>   	down_read(&vdev->cf_lock);
>>> -	status = vdev->config->get_status(vdev);
>>> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>> completed");
>>> -		err = -EAGAIN;
>>> -		goto out;
>>> -	}
>>> -
>>>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>   			  VDPA_CMD_DEV_CONFIG_GET);
>>>   	if (!hdr) {
>>> --
>>> 2.31.1
Jason Wang July 28, 2022, 2:06 a.m. UTC | #5
在 2022/7/28 08:56, Si-Wei Liu 写道:
>
>
> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>
>>
>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>> Sorry to chime in late in the game. For some reason I couldn't get 
>>> to most emails for this discussion (I only subscribed to the 
>>> virtualization list), while I was taking off amongst the past few 
>>> weeks.
>>>
>>> It looks to me this patch is incomplete. Noted down the way in 
>>> vdpa_dev_net_config_fill(), we have the following:
>>>          features = vdev->config->get_driver_features(vdev);
>>>          if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>                                VDPA_ATTR_PAD))
>>>                  return -EMSGSIZE;
>>>
>>> Making call to .get_driver_features() doesn't make sense when 
>>> feature negotiation isn't complete. Neither should present 
>>> negotiated_features to userspace before negotiation is done.
>>>
>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably 
>>> should not show before negotiation is done - it depends on driver 
>>> features negotiated.
>> I have another patch in this series introduces device_features and 
>> will report device_features to the userspace even features 
>> negotiation not done. Because the spec says we should allow driver 
>> access the config space before FEATURES_OK.
> The config space can be accessed by guest before features_ok doesn't 
> necessarily mean the value is valid. 


It's valid as long as the device offers the feature:

"The device MUST allow reading of any device-specific configuration 
field before FEATURES_OK is set by the driver. This includes fields 
which are conditional on feature bits, as long as those feature bits are 
offered by the device."


> You may want to double check with Michael for what he quoted earlier:
>> Nope:
>>
>> 2.5.1  Driver Requirements: Device Configuration Space
>>
>> ...
>>
>> For optional configuration space fields, the driver MUST check that the corresponding feature is offered
>> before accessing that part of the configuration space.
>
> and how many driver bugs taking wrong assumption of the validity of 
> config space field without features_ok. I am not sure what use case 
> you want to expose config resister values for before features_ok, if 
> it's mostly for live migration I guess it's probably heading a wrong 
> direction.


I guess it's not for migration. For migration, a provision with the 
correct features/capability would be sufficient.

Thanks


>
>
>>>
>>>
>>> Last but not the least, this "vdpa dev config" command was not 
>>> designed to display the real config space register values in the 
>>> first place. Quoting the vdpa-dev(8) man page:
>>>
>>>> vdpa dev config show - Show configuration of specific device or all 
>>>> devices.
>>>> DEV - specifies the vdpa device to show its configuration. If this 
>>>> argument is omitted all devices configuration is listed.
>>> It doesn't say anything about configuration space or register values 
>>> in config space. As long as it can convey the config attribute when 
>>> instantiating vDPA device instance, and more importantly, the config 
>>> can be easily imported from or exported to userspace tools when 
>>> trying to reconstruct vdpa instance intact on destination host for 
>>> live migration, IMHO in my personal interpretation it doesn't matter 
>>> what the config space may present. It may be worth while adding a 
>>> new debug command to expose the real register value, but that's 
>>> another story.
>> I am not sure getting your points. vDPA now reports device feature 
>> bits(device_features) and negotiated feature bits(driver_features), 
>> and yes, the drivers features can be a subset of the device features; 
>> and the vDPA device features can be a subset of the management device 
>> features.
> What I said is after unblocking the conditional check, you'd have to 
> handle the case for each of the vdpa attribute when feature 
> negotiation is not yet done: basically the register values you got 
> from config space via the vdpa_get_config_unlocked() call is not 
> considered to be valid before features_ok (per-spec). Although in some 
> case you may get sane value, such behavior is generally undefined. If 
> you desire to show just the device_features alone without any config 
> space field, which the device had advertised *before feature 
> negotiation is complete*, that'll be fine. But looks to me this is not 
> how patch has been implemented. Probably need some more work?
>
> Regards,
> -Siwei
>
>>>
>>> Having said, please consider to drop the Fixes tag, as appears to me 
>>> you're proposing a new feature rather than fixing a real issue.
>> it's a new feature to report the device feature bits than only 
>> negotiated features, however this patch is a must, or it will block 
>> the device feature bits reporting. but I agree, the fix tag is not a 
>> must.
>>>
>>> Thanks,
>>> -Siwei
>>>
>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>
>>>>> Users may want to query the config space of a vDPA device, to choose a
>>>>> appropriate one for a certain guest. This means the users need to read the
>>>>> config space before FEATURES_OK, and the existence of config space
>>>>> contents does not depend on FEATURES_OK.
>>>>>
>>>>> The spec says:
>>>>> The device MUST allow reading of any device-specific configuration field
>>>>> before FEATURES_OK is set by the driver. This includes fields which are
>>>>> conditional on feature bits, as long as those feature bits are offered by the
>>>>> device.
>>>>>
>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
>>>> Fix is fine, but fixes tag needs correction described below.
>>>>
>>>> Above commit id is 13 letters should be 12.
>>>> And
>>>> It should be in format
>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
>>>>
>>>> Please use checkpatch.pl script before posting the patches to catch these errors.
>>>> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.
>>>>
>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>> ---
>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>   1 file changed, 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>> --- a/drivers/vdpa/vdpa.c
>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>   	u32 device_id;
>>>>>   	void *hdr;
>>>>> -	u8 status;
>>>>>   	int err;
>>>>>
>>>>>   	down_read(&vdev->cf_lock);
>>>>> -	status = vdev->config->get_status(vdev);
>>>>> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>> completed");
>>>>> -		err = -EAGAIN;
>>>>> -		goto out;
>>>>> -	}
>>>>> -
>>>>>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>   			  VDPA_CMD_DEV_CONFIG_GET);
>>>>>   	if (!hdr) {
>>>>> --
>>>>> 2.31.1
>>>> _______________________________________________
>>>> Virtualization mailing list
>>>> Virtualization@lists.linux-foundation.org
>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>
>>
>
Si-Wei Liu July 28, 2022, 7:08 a.m. UTC | #6
On 7/27/2022 7:06 PM, Jason Wang wrote:
>
> 在 2022/7/28 08:56, Si-Wei Liu 写道:
>>
>>
>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>> Sorry to chime in late in the game. For some reason I couldn't get 
>>>> to most emails for this discussion (I only subscribed to the 
>>>> virtualization list), while I was taking off amongst the past few 
>>>> weeks.
>>>>
>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>          features = vdev->config->get_driver_features(vdev);
>>>>          if (nla_put_u64_64bit(msg, 
>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>                                VDPA_ATTR_PAD))
>>>>                  return -EMSGSIZE;
>>>>
>>>> Making call to .get_driver_features() doesn't make sense when 
>>>> feature negotiation isn't complete. Neither should present 
>>>> negotiated_features to userspace before negotiation is done.
>>>>
>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably 
>>>> should not show before negotiation is done - it depends on driver 
>>>> features negotiated.
>>> I have another patch in this series introduces device_features and 
>>> will report device_features to the userspace even features 
>>> negotiation not done. Because the spec says we should allow driver 
>>> access the config space before FEATURES_OK.
>> The config space can be accessed by guest before features_ok doesn't 
>> necessarily mean the value is valid. 
>
>
> It's valid as long as the device offers the feature:
>
> "The device MUST allow reading of any device-specific configuration 
> field before FEATURES_OK is set by the driver. This includes fields 
> which are conditional on feature bits, as long as those feature bits 
> are offered by the device."
I guess this statement only conveys that the field in config space can 
be read before FEATURES_OK is set, though it does not *explicitly* 
states the validity of field.

And looking at:

"The mac address field always exists (though is only valid if 
VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS 
is set."

It appears to me there's a border line set between "exist" and "valid". 
If I understand the spec wording correctly, a spec-conforming device 
implementation may or may not offer valid status value in the config 
space when VIRTIO_NET_F_STATUS is offered, but before the feature is 
negotiated. On the other hand, config space should contain valid mac 
address the moment VIRTIO_NET_F_MAC feature is offered, regardless being 
negotiated or not. By that, there seems to be leeway for the device 
implementation to decide when config space field may become valid, 
though for most of QEMU's software virtio devices, valid value is 
present to config space the very first moment when feature is offered.

"If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac 
entry indicates the “physical” address of the network card, otherwise 
the driver would typically generate a random local MAC address."
"If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status 
comes from the bottom bit of status. Otherwise, the driver assumes it’s 
active."

And also there are special cases where the read of specific 
configuration space field MUST be deferred to until FEATURES_OK is set:

"If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode 
can be read or set through the writeback field. 0 corresponds to a 
writethrough cache, 1 to a writeback cache11. The cache mode after reset 
can be either writeback or writethrough. The actual mode can be 
determined by reading writeback after feature negotiation."
"The driver MUST NOT read writeback before setting the FEATURES_OK 
device status bit."
"If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not, 
the device MUST initialize writeback to 0."

Since the spec doesn't explicitly mandate the validity of each config 
space field when feature of concern is offered, to be safer we'd have to 
live with odd device implementation. I know for sure QEMU software 
devices won't for 99% of these cases, but that's not what is currently 
defined in the spec.

>
>
>> You may want to double check with Michael for what he quoted earlier:
>>> Nope:
>>>
>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>
>>> ...
>>>
>>> For optional configuration space fields, the driver MUST check that 
>>> the corresponding feature is offered
>>> before accessing that part of the configuration space.
>>
>> and how many driver bugs taking wrong assumption of the validity of 
>> config space field without features_ok. I am not sure what use case 
>> you want to expose config resister values for before features_ok, if 
>> it's mostly for live migration I guess it's probably heading a wrong 
>> direction.
>
>
> I guess it's not for migration. 
Then what's the other possible use case than live migration, were to 
expose config space values? Troubleshooting config space discrepancy 
between vDPA and the emulated virtio device in userspace? Or tracking 
changes in config space across feature negotiation, but for what? It'd 
be beneficial to the interface design if the specific use case can be 
clearly described...


> For migration, a provision with the correct features/capability would 
> be sufficient.
Right, that's what I thought too. It doesn't need to expose config space 
values, simply exporting all attributes for vdpa device creation will do 
the work.

-Siwei

>
> Thanks
>
>
>>
>>
>>>>
>>>>
>>>> Last but not the least, this "vdpa dev config" command was not 
>>>> designed to display the real config space register values in the 
>>>> first place. Quoting the vdpa-dev(8) man page:
>>>>
>>>>> vdpa dev config show - Show configuration of specific device or 
>>>>> all devices.
>>>>> DEV - specifies the vdpa device to show its configuration. If this 
>>>>> argument is omitted all devices configuration is listed.
>>>> It doesn't say anything about configuration space or register 
>>>> values in config space. As long as it can convey the config 
>>>> attribute when instantiating vDPA device instance, and more 
>>>> importantly, the config can be easily imported from or exported to 
>>>> userspace tools when trying to reconstruct vdpa instance intact on 
>>>> destination host for live migration, IMHO in my personal 
>>>> interpretation it doesn't matter what the config space may present. 
>>>> It may be worth while adding a new debug command to expose the real 
>>>> register value, but that's another story.
>>> I am not sure getting your points. vDPA now reports device feature 
>>> bits(device_features) and negotiated feature bits(driver_features), 
>>> and yes, the drivers features can be a subset of the device 
>>> features; and the vDPA device features can be a subset of the 
>>> management device features.
>> What I said is after unblocking the conditional check, you'd have to 
>> handle the case for each of the vdpa attribute when feature 
>> negotiation is not yet done: basically the register values you got 
>> from config space via the vdpa_get_config_unlocked() call is not 
>> considered to be valid before features_ok (per-spec). Although in 
>> some case you may get sane value, such behavior is generally 
>> undefined. If you desire to show just the device_features alone 
>> without any config space field, which the device had advertised 
>> *before feature negotiation is complete*, that'll be fine. But looks 
>> to me this is not how patch has been implemented. Probably need some 
>> more work?
>>
>> Regards,
>> -Siwei
>>
>>>>
>>>> Having said, please consider to drop the Fixes tag, as appears to 
>>>> me you're proposing a new feature rather than fixing a real issue.
>>> it's a new feature to report the device feature bits than only 
>>> negotiated features, however this patch is a must, or it will block 
>>> the device feature bits reporting. but I agree, the fix tag is not a 
>>> must.
>>>>
>>>> Thanks,
>>>> -Siwei
>>>>
>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>
>>>>>> Users may want to query the config space of a vDPA device, to 
>>>>>> choose a
>>>>>> appropriate one for a certain guest. This means the users need to 
>>>>>> read the
>>>>>> config space before FEATURES_OK, and the existence of config space
>>>>>> contents does not depend on FEATURES_OK.
>>>>>>
>>>>>> The spec says:
>>>>>> The device MUST allow reading of any device-specific 
>>>>>> configuration field
>>>>>> before FEATURES_OK is set by the driver. This includes fields 
>>>>>> which are
>>>>>> conditional on feature bits, as long as those feature bits are 
>>>>>> offered by the
>>>>>> device.
>>>>>>
>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if 
>>>>>> FEATURES_OK)
>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>
>>>>> Above commit id is 13 letters should be 12.
>>>>> And
>>>>> It should be in format
>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if 
>>>>> FEATURES_OK")
>>>>>
>>>>> Please use checkpatch.pl script before posting the patches to 
>>>>> catch these errors.
>>>>> There is a bot that looks at the fixes tag and identifies the 
>>>>> right kernel version to apply this fix.
>>>>>
>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>> ---
>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>   1 file changed, 8 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>       u32 device_id;
>>>>>>       void *hdr;
>>>>>> -    u8 status;
>>>>>>       int err;
>>>>>>
>>>>>>       down_read(&vdev->cf_lock);
>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>> completed");
>>>>>> -        err = -EAGAIN;
>>>>>> -        goto out;
>>>>>> -    }
>>>>>> -
>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>>>>       if (!hdr) {
>>>>>> -- 
>>>>>> 2.31.1
>>>>> _______________________________________________
>>>>> Virtualization mailing list
>>>>> Virtualization@lists.linux-foundation.org
>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!Pkwym7OAjoDucUqs2fAwchxqL8-BGd6wOl-51xcgB_yCNwPJ_cs8A1y-cYmrLTB4OBNsimnZuqJPcvQIl3g$ 
>>>>
>>>>
>>>
>>
>
Jason Wang July 28, 2022, 7:36 a.m. UTC | #7
On Thu, Jul 28, 2022 at 3:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 7/27/2022 7:06 PM, Jason Wang wrote:
> >
> > 在 2022/7/28 08:56, Si-Wei Liu 写道:
> >>
> >>
> >> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
> >>>
> >>>
> >>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
> >>>> Sorry to chime in late in the game. For some reason I couldn't get
> >>>> to most emails for this discussion (I only subscribed to the
> >>>> virtualization list), while I was taking off amongst the past few
> >>>> weeks.
> >>>>
> >>>> It looks to me this patch is incomplete. Noted down the way in
> >>>> vdpa_dev_net_config_fill(), we have the following:
> >>>>          features = vdev->config->get_driver_features(vdev);
> >>>>          if (nla_put_u64_64bit(msg,
> >>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
> >>>>                                VDPA_ATTR_PAD))
> >>>>                  return -EMSGSIZE;
> >>>>
> >>>> Making call to .get_driver_features() doesn't make sense when
> >>>> feature negotiation isn't complete. Neither should present
> >>>> negotiated_features to userspace before negotiation is done.
> >>>>
> >>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably
> >>>> should not show before negotiation is done - it depends on driver
> >>>> features negotiated.
> >>> I have another patch in this series introduces device_features and
> >>> will report device_features to the userspace even features
> >>> negotiation not done. Because the spec says we should allow driver
> >>> access the config space before FEATURES_OK.
> >> The config space can be accessed by guest before features_ok doesn't
> >> necessarily mean the value is valid.
> >
> >
> > It's valid as long as the device offers the feature:
> >
> > "The device MUST allow reading of any device-specific configuration
> > field before FEATURES_OK is set by the driver. This includes fields
> > which are conditional on feature bits, as long as those feature bits
> > are offered by the device."
> I guess this statement only conveys that the field in config space can
> be read before FEATURES_OK is set, though it does not *explicitly*
> states the validity of field.

My understanding is that it should be valid as long as the device
offers the feature.

For example, if _MQ is offered by device, the max_virt_queue_pairs is
always valid and can be read from the driver no matter whether _MQ is
negotiated.

>
> And looking at:
>
> "The mac address field always exists (though is only valid if
> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS
> is set."
>
> It appears to me there's a border line set between "exist" and "valid".
> If I understand the spec wording correctly, a spec-conforming device
> implementation may or may not offer valid status value in the config
> space when VIRTIO_NET_F_STATUS is offered, but before the feature is
> negotiated.

That's not what I read, maybe Michael can clarify this.

> On the other hand, config space should contain valid mac
> address the moment VIRTIO_NET_F_MAC feature is offered, regardless being
> negotiated or not.

I agree here.

>By that, there seems to be leeway for the device
> implementation to decide when config space field may become valid,
> though for most of QEMU's software virtio devices, valid value is
> present to config space the very first moment when feature is offered.
>
> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
> entry indicates the “physical” address of the network card, otherwise
> the driver would typically generate a random local MAC address."
> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status
> comes from the bottom bit of status. Otherwise, the driver assumes it’s
> active."

This is mostly the way how drivers that don't support _F_STATUS work.

>
> And also there are special cases where the read of specific
> configuration space field MUST be deferred to until FEATURES_OK is set:
>
> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode
> can be read or set through the writeback field. 0 corresponds to a
> writethrough cache, 1 to a writeback cache11. The cache mode after reset
> can be either writeback or writethrough. The actual mode can be
> determined by reading writeback after feature negotiation."
> "The driver MUST NOT read writeback before setting the FEATURES_OK
> device status bit."

This seems to conflict with the normatives I quoted above, and I don't
get why we need this.

> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not,
> the device MUST initialize writeback to 0."
>
> Since the spec doesn't explicitly mandate the validity of each config
> space field when feature of concern is offered, to be safer we'd have to
> live with odd device implementation. I know for sure QEMU software
> devices won't for 99% of these cases, but that's not what is currently
> defined in the spec.
>
> >
> >
> >> You may want to double check with Michael for what he quoted earlier:
> >>> Nope:
> >>>
> >>> 2.5.1  Driver Requirements: Device Configuration Space
> >>>
> >>> ...
> >>>
> >>> For optional configuration space fields, the driver MUST check that
> >>> the corresponding feature is offered
> >>> before accessing that part of the configuration space.
> >>
> >> and how many driver bugs taking wrong assumption of the validity of
> >> config space field without features_ok. I am not sure what use case
> >> you want to expose config resister values for before features_ok, if
> >> it's mostly for live migration I guess it's probably heading a wrong
> >> direction.
> >
> >
> > I guess it's not for migration.
> Then what's the other possible use case than live migration, were to
> expose config space values? Troubleshooting config space discrepancy
> between vDPA and the emulated virtio device in userspace? Or tracking
> changes in config space across feature negotiation, but for what? It'd
> be beneficial to the interface design if the specific use case can be
> clearly described...

Monitoring or debugging I guess.

Thanks

>
>
> > For migration, a provision with the correct features/capability would
> > be sufficient.
> Right, that's what I thought too. It doesn't need to expose config space
> values, simply exporting all attributes for vdpa device creation will do
> the work.
>
> -Siwei
>
> >
> > Thanks
> >
> >
> >>
> >>
> >>>>
> >>>>
> >>>> Last but not the least, this "vdpa dev config" command was not
> >>>> designed to display the real config space register values in the
> >>>> first place. Quoting the vdpa-dev(8) man page:
> >>>>
> >>>>> vdpa dev config show - Show configuration of specific device or
> >>>>> all devices.
> >>>>> DEV - specifies the vdpa device to show its configuration. If this
> >>>>> argument is omitted all devices configuration is listed.
> >>>> It doesn't say anything about configuration space or register
> >>>> values in config space. As long as it can convey the config
> >>>> attribute when instantiating vDPA device instance, and more
> >>>> importantly, the config can be easily imported from or exported to
> >>>> userspace tools when trying to reconstruct vdpa instance intact on
> >>>> destination host for live migration, IMHO in my personal
> >>>> interpretation it doesn't matter what the config space may present.
> >>>> It may be worth while adding a new debug command to expose the real
> >>>> register value, but that's another story.
> >>> I am not sure getting your points. vDPA now reports device feature
> >>> bits(device_features) and negotiated feature bits(driver_features),
> >>> and yes, the drivers features can be a subset of the device
> >>> features; and the vDPA device features can be a subset of the
> >>> management device features.
> >> What I said is after unblocking the conditional check, you'd have to
> >> handle the case for each of the vdpa attribute when feature
> >> negotiation is not yet done: basically the register values you got
> >> from config space via the vdpa_get_config_unlocked() call is not
> >> considered to be valid before features_ok (per-spec). Although in
> >> some case you may get sane value, such behavior is generally
> >> undefined. If you desire to show just the device_features alone
> >> without any config space field, which the device had advertised
> >> *before feature negotiation is complete*, that'll be fine. But looks
> >> to me this is not how patch has been implemented. Probably need some
> >> more work?
> >>
> >> Regards,
> >> -Siwei
> >>
> >>>>
> >>>> Having said, please consider to drop the Fixes tag, as appears to
> >>>> me you're proposing a new feature rather than fixing a real issue.
> >>> it's a new feature to report the device feature bits than only
> >>> negotiated features, however this patch is a must, or it will block
> >>> the device feature bits reporting. but I agree, the fix tag is not a
> >>> must.
> >>>>
> >>>> Thanks,
> >>>> -Siwei
> >>>>
> >>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
> >>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
> >>>>>> Sent: Friday, July 1, 2022 9:28 AM
> >>>>>>
> >>>>>> Users may want to query the config space of a vDPA device, to
> >>>>>> choose a
> >>>>>> appropriate one for a certain guest. This means the users need to
> >>>>>> read the
> >>>>>> config space before FEATURES_OK, and the existence of config space
> >>>>>> contents does not depend on FEATURES_OK.
> >>>>>>
> >>>>>> The spec says:
> >>>>>> The device MUST allow reading of any device-specific
> >>>>>> configuration field
> >>>>>> before FEATURES_OK is set by the driver. This includes fields
> >>>>>> which are
> >>>>>> conditional on feature bits, as long as those feature bits are
> >>>>>> offered by the
> >>>>>> device.
> >>>>>>
> >>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if
> >>>>>> FEATURES_OK)
> >>>>> Fix is fine, but fixes tag needs correction described below.
> >>>>>
> >>>>> Above commit id is 13 letters should be 12.
> >>>>> And
> >>>>> It should be in format
> >>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
> >>>>> FEATURES_OK")
> >>>>>
> >>>>> Please use checkpatch.pl script before posting the patches to
> >>>>> catch these errors.
> >>>>> There is a bot that looks at the fixes tag and identifies the
> >>>>> right kernel version to apply this fix.
> >>>>>
> >>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
> >>>>>> ---
> >>>>>>   drivers/vdpa/vdpa.c | 8 --------
> >>>>>>   1 file changed, 8 deletions(-)
> >>>>>>
> >>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> >>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
> >>>>>> --- a/drivers/vdpa/vdpa.c
> >>>>>> +++ b/drivers/vdpa/vdpa.c
> >>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
> >>>>>> struct sk_buff *msg, u32 portid,  {
> >>>>>>       u32 device_id;
> >>>>>>       void *hdr;
> >>>>>> -    u8 status;
> >>>>>>       int err;
> >>>>>>
> >>>>>>       down_read(&vdev->cf_lock);
> >>>>>> -    status = vdev->config->get_status(vdev);
> >>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> >>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
> >>>>>> completed");
> >>>>>> -        err = -EAGAIN;
> >>>>>> -        goto out;
> >>>>>> -    }
> >>>>>> -
> >>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> >>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
> >>>>>>       if (!hdr) {
> >>>>>> --
> >>>>>> 2.31.1
> >>>>> _______________________________________________
> >>>>> Virtualization mailing list
> >>>>> Virtualization@lists.linux-foundation.org
> >>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!Pkwym7OAjoDucUqs2fAwchxqL8-BGd6wOl-51xcgB_yCNwPJ_cs8A1y-cYmrLTB4OBNsimnZuqJPcvQIl3g$
> >>>>
> >>>>
> >>>
> >>
> >
>
Zhu, Lingshan July 28, 2022, 7:44 a.m. UTC | #8
On 7/28/2022 3:36 PM, Jason Wang wrote:
> On Thu, Jul 28, 2022 at 3:09 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 7/27/2022 7:06 PM, Jason Wang wrote:
>>> 在 2022/7/28 08:56, Si-Wei Liu 写道:
>>>>
>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>
>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>> Sorry to chime in late in the game. For some reason I couldn't get
>>>>>> to most emails for this discussion (I only subscribed to the
>>>>>> virtualization list), while I was taking off amongst the past few
>>>>>> weeks.
>>>>>>
>>>>>> It looks to me this patch is incomplete. Noted down the way in
>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>                                 VDPA_ATTR_PAD))
>>>>>>                   return -EMSGSIZE;
>>>>>>
>>>>>> Making call to .get_driver_features() doesn't make sense when
>>>>>> feature negotiation isn't complete. Neither should present
>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>
>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably
>>>>>> should not show before negotiation is done - it depends on driver
>>>>>> features negotiated.
>>>>> I have another patch in this series introduces device_features and
>>>>> will report device_features to the userspace even features
>>>>> negotiation not done. Because the spec says we should allow driver
>>>>> access the config space before FEATURES_OK.
>>>> The config space can be accessed by guest before features_ok doesn't
>>>> necessarily mean the value is valid.
>>>
>>> It's valid as long as the device offers the feature:
>>>
>>> "The device MUST allow reading of any device-specific configuration
>>> field before FEATURES_OK is set by the driver. This includes fields
>>> which are conditional on feature bits, as long as those feature bits
>>> are offered by the device."
>> I guess this statement only conveys that the field in config space can
>> be read before FEATURES_OK is set, though it does not *explicitly*
>> states the validity of field.
> My understanding is that it should be valid as long as the device
> offers the feature.
>
> For example, if _MQ is offered by device, the max_virt_queue_pairs is
> always valid and can be read from the driver no matter whether _MQ is
> negotiated.
agreed, that's also my understanding
>
>> And looking at:
>>
>> "The mac address field always exists (though is only valid if
>> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS
>> is set."
>>
>> It appears to me there's a border line set between "exist" and "valid".
>> If I understand the spec wording correctly, a spec-conforming device
>> implementation may or may not offer valid status value in the config
>> space when VIRTIO_NET_F_STATUS is offered, but before the feature is
>> negotiated.
> That's not what I read, maybe Michael can clarify this.
>
>> On the other hand, config space should contain valid mac
>> address the moment VIRTIO_NET_F_MAC feature is offered, regardless being
>> negotiated or not.
> I agree here.
>
>> By that, there seems to be leeway for the device
>> implementation to decide when config space field may become valid,
>> though for most of QEMU's software virtio devices, valid value is
>> present to config space the very first moment when feature is offered.
>>
>> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
>> entry indicates the “physical” address of the network card, otherwise
>> the driver would typically generate a random local MAC address."
>> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status
>> comes from the bottom bit of status. Otherwise, the driver assumes it’s
>> active."
> This is mostly the way how drivers that don't support _F_STATUS work.
>
>> And also there are special cases where the read of specific
>> configuration space field MUST be deferred to until FEATURES_OK is set:
>>
>> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode
>> can be read or set through the writeback field. 0 corresponds to a
>> writethrough cache, 1 to a writeback cache11. The cache mode after reset
>> can be either writeback or writethrough. The actual mode can be
>> determined by reading writeback after feature negotiation."
>> "The driver MUST NOT read writeback before setting the FEATURES_OK
>> device status bit."
> This seems to conflict with the normatives I quoted above, and I don't
> get why we need this.
>
>> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not,
>> the device MUST initialize writeback to 0."
>>
>> Since the spec doesn't explicitly mandate the validity of each config
>> space field when feature of concern is offered, to be safer we'd have to
>> live with odd device implementation. I know for sure QEMU software
>> devices won't for 99% of these cases, but that's not what is currently
>> defined in the spec.
>>
>>>
>>>> You may want to double check with Michael for what he quoted earlier:
>>>>> Nope:
>>>>>
>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>
>>>>> ...
>>>>>
>>>>> For optional configuration space fields, the driver MUST check that
>>>>> the corresponding feature is offered
>>>>> before accessing that part of the configuration space.
>>>> and how many driver bugs taking wrong assumption of the validity of
>>>> config space field without features_ok. I am not sure what use case
>>>> you want to expose config resister values for before features_ok, if
>>>> it's mostly for live migration I guess it's probably heading a wrong
>>>> direction.
>>>
>>> I guess it's not for migration.
>> Then what's the other possible use case than live migration, were to
>> expose config space values? Troubleshooting config space discrepancy
>> between vDPA and the emulated virtio device in userspace? Or tracking
>> changes in config space across feature negotiation, but for what? It'd
>> be beneficial to the interface design if the specific use case can be
>> clearly described...
> Monitoring or debugging I guess.
>
> Thanks
>
>>
>>> For migration, a provision with the correct features/capability would
>>> be sufficient.
>> Right, that's what I thought too. It doesn't need to expose config space
>> values, simply exporting all attributes for vdpa device creation will do
>> the work.
>>
>> -Siwei
>>
>>> Thanks
>>>
>>>
>>>>
>>>>>>
>>>>>> Last but not the least, this "vdpa dev config" command was not
>>>>>> designed to display the real config space register values in the
>>>>>> first place. Quoting the vdpa-dev(8) man page:
>>>>>>
>>>>>>> vdpa dev config show - Show configuration of specific device or
>>>>>>> all devices.
>>>>>>> DEV - specifies the vdpa device to show its configuration. If this
>>>>>>> argument is omitted all devices configuration is listed.
>>>>>> It doesn't say anything about configuration space or register
>>>>>> values in config space. As long as it can convey the config
>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>> importantly, the config can be easily imported from or exported to
>>>>>> userspace tools when trying to reconstruct vdpa instance intact on
>>>>>> destination host for live migration, IMHO in my personal
>>>>>> interpretation it doesn't matter what the config space may present.
>>>>>> It may be worth while adding a new debug command to expose the real
>>>>>> register value, but that's another story.
>>>>> I am not sure getting your points. vDPA now reports device feature
>>>>> bits(device_features) and negotiated feature bits(driver_features),
>>>>> and yes, the drivers features can be a subset of the device
>>>>> features; and the vDPA device features can be a subset of the
>>>>> management device features.
>>>> What I said is after unblocking the conditional check, you'd have to
>>>> handle the case for each of the vdpa attribute when feature
>>>> negotiation is not yet done: basically the register values you got
>>>> from config space via the vdpa_get_config_unlocked() call is not
>>>> considered to be valid before features_ok (per-spec). Although in
>>>> some case you may get sane value, such behavior is generally
>>>> undefined. If you desire to show just the device_features alone
>>>> without any config space field, which the device had advertised
>>>> *before feature negotiation is complete*, that'll be fine. But looks
>>>> to me this is not how patch has been implemented. Probably need some
>>>> more work?
>>>>
>>>> Regards,
>>>> -Siwei
>>>>
>>>>>> Having said, please consider to drop the Fixes tag, as appears to
>>>>>> me you're proposing a new feature rather than fixing a real issue.
>>>>> it's a new feature to report the device feature bits than only
>>>>> negotiated features, however this patch is a must, or it will block
>>>>> the device feature bits reporting. but I agree, the fix tag is not a
>>>>> must.
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>>
>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>
>>>>>>>> Users may want to query the config space of a vDPA device, to
>>>>>>>> choose a
>>>>>>>> appropriate one for a certain guest. This means the users need to
>>>>>>>> read the
>>>>>>>> config space before FEATURES_OK, and the existence of config space
>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>
>>>>>>>> The spec says:
>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>> configuration field
>>>>>>>> before FEATURES_OK is set by the driver. This includes fields
>>>>>>>> which are
>>>>>>>> conditional on feature bits, as long as those feature bits are
>>>>>>>> offered by the
>>>>>>>> device.
>>>>>>>>
>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if
>>>>>>>> FEATURES_OK)
>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>
>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>> And
>>>>>>> It should be in format
>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
>>>>>>> FEATURES_OK")
>>>>>>>
>>>>>>> Please use checkpatch.pl script before posting the patches to
>>>>>>> catch these errors.
>>>>>>> There is a bot that looks at the fixes tag and identifies the
>>>>>>> right kernel version to apply this fix.
>>>>>>>
>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> ---
>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>        u32 device_id;
>>>>>>>>        void *hdr;
>>>>>>>> -    u8 status;
>>>>>>>>        int err;
>>>>>>>>
>>>>>>>>        down_read(&vdev->cf_lock);
>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>> completed");
>>>>>>>> -        err = -EAGAIN;
>>>>>>>> -        goto out;
>>>>>>>> -    }
>>>>>>>> -
>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>                  VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>        if (!hdr) {
>>>>>>>> --
>>>>>>>> 2.31.1
>>>>>>> _______________________________________________
>>>>>>> Virtualization mailing list
>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!Pkwym7OAjoDucUqs2fAwchxqL8-BGd6wOl-51xcgB_yCNwPJ_cs8A1y-cYmrLTB4OBNsimnZuqJPcvQIl3g$
>>>>>>
Michael S. Tsirkin July 28, 2022, 11:28 a.m. UTC | #9
On Thu, Jul 28, 2022 at 01:20:26AM -0700, Si-Wei Liu wrote:
> Hi Michael,
> 
> Could you please comment on the different wording between "exist" and "valid"
> in spec? Having seen quite a few relevant discussions regarding MTU validation
> and VERSION_1 negotiation on s390, I was in the impression this is not the
> first time people getting confused because of ambiguity of random wording
> without detailed example helping to clarify around the context, or due lack of
> clear definition set ahead. I like your idea to keep things consistent
> (conditional depending on feature presence), however, without proper
> interpretation of how spec is supposed to work, we are on a slippery slope
> towards inconsistency.
> 
> On 7/28/2022 12:36 AM, Jason Wang wrote:
> 
>         And looking at:
> 
>         "The mac address field always exists (though is only valid if
>         VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS
>         is set."
> 
>         It appears to me there's a border line set between "exist" and "valid".
>         If I understand the spec wording correctly, a spec-conforming device
>         implementation may or may not offer valid status value in the config
>         space when VIRTIO_NET_F_STATUS is offered, but before the feature is
>         negotiated.
> 
>     That's not what I read, maybe Michael can clarify this.
> 
> 
> 
> And Jason and I find below normatives are conflict with each other.
> 
>         "The device MUST allow reading of any device-specific configuration
>         field before FEATURES_OK is set by the driver. This includes fields
>         which are conditional on feature bits, as long as those feature bits are
>         offered by the device."


So I proposed this back in April:

https://lists.oasis-open.org/archives/virtio-comment/202201/msg00068.html

I intended this for 1.2 but it quickly became clear it won't make it
in time. Working on reviving the proposal and addressing the comments.




> 
>     ...
> 
>         And also there are special cases where the read of specific
>         configuration space field MUST be deferred to until FEATURES_OK is set:
> 
>         "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode
>         can be read or set through the writeback field. 0 corresponds to a
>         writethrough cache, 1 to a writeback cache11. The cache mode after reset
>         can be either writeback or writethrough. The actual mode can be
>         determined by reading writeback after feature negotiation."
>         "The driver MUST NOT read writeback before setting the FEATURES_OK
>         device status bit."
> 
>     This seems to conflict with the normatives I quoted above, and I don't
>     get why we need this.
> 
> 
> 
> Thanks,
> -Siwei


The last one I take to mean writeback is special.
I am not sure why it should be. Paolo you proposed this text could
you comment please?

Thanks!
Michael S. Tsirkin July 28, 2022, 11:35 a.m. UTC | #10
On Thu, Jul 28, 2022 at 12:08:53AM -0700, Si-Wei Liu wrote:
> 
> 
> On 7/27/2022 7:06 PM, Jason Wang wrote:
> > 
> > 在 2022/7/28 08:56, Si-Wei Liu 写道:
> > > 
> > > 
> > > On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
> > > > 
> > > > 
> > > > On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
> > > > > Sorry to chime in late in the game. For some reason I
> > > > > couldn't get to most emails for this discussion (I only
> > > > > subscribed to the virtualization list), while I was taking
> > > > > off amongst the past few weeks.
> > > > > 
> > > > > It looks to me this patch is incomplete. Noted down the way
> > > > > in vdpa_dev_net_config_fill(), we have the following:
> > > > >          features = vdev->config->get_driver_features(vdev);
> > > > >          if (nla_put_u64_64bit(msg,
> > > > > VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
> > > > >                                VDPA_ATTR_PAD))
> > > > >                  return -EMSGSIZE;
> > > > > 
> > > > > Making call to .get_driver_features() doesn't make sense
> > > > > when feature negotiation isn't complete. Neither should
> > > > > present negotiated_features to userspace before negotiation
> > > > > is done.
> > > > > 
> > > > > Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
> > > > > probably should not show before negotiation is done - it
> > > > > depends on driver features negotiated.
> > > > I have another patch in this series introduces device_features
> > > > and will report device_features to the userspace even features
> > > > negotiation not done. Because the spec says we should allow
> > > > driver access the config space before FEATURES_OK.
> > > The config space can be accessed by guest before features_ok doesn't
> > > necessarily mean the value is valid.
> > 
> > 
> > It's valid as long as the device offers the feature:
> > 
> > "The device MUST allow reading of any device-specific configuration
> > field before FEATURES_OK is set by the driver. This includes fields
> > which are conditional on feature bits, as long as those feature bits are
> > offered by the device."
> I guess this statement only conveys that the field in config space can be
> read before FEATURES_OK is set, though it does not *explicitly* states the
> validity of field.
> 
> And looking at:
> 
> "The mac address field always exists (though is only valid if
> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS is
> set."
> 
> It appears to me there's a border line set between "exist" and "valid". If I
> understand the spec wording correctly, a spec-conforming device
> implementation may or may not offer valid status value in the config space
> when VIRTIO_NET_F_STATUS is offered, but before the feature is negotiated.
> On the other hand, config space should contain valid mac address the moment
> VIRTIO_NET_F_MAC feature is offered, regardless being negotiated or not. By
> that, there seems to be leeway for the device implementation to decide when
> config space field may become valid, though for most of QEMU's software
> virtio devices, valid value is present to config space the very first moment
> when feature is offered.
> 
> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
> entry indicates the “physical” address of the network card, otherwise the
> driver would typically generate a random local MAC address."
> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status comes
> from the bottom bit of status. Otherwise, the driver assumes it’s active."
> 
> And also there are special cases where the read of specific configuration
> space field MUST be deferred to until FEATURES_OK is set:
> 
> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode can be
> read or set through the writeback field. 0 corresponds to a writethrough
> cache, 1 to a writeback cache11. The cache mode after reset can be either
> writeback or writethrough. The actual mode can be determined by reading
> writeback after feature negotiation."
> "The driver MUST NOT read writeback before setting the FEATURES_OK device
> status bit."
> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not, the
> device MUST initialize writeback to 0."
> 
> Since the spec doesn't explicitly mandate the validity of each config space
> field when feature of concern is offered, to be safer we'd have to live with
> odd device implementation. I know for sure QEMU software devices won't for
> 99% of these cases, but that's not what is currently defined in the spec.


Thanks for raising this subject. I started working on this in April:

https://lists.oasis-open.org/archives/virtio-comment/202201/msg00068.html

working now to address the comments.


> > 
> > 
> > > You may want to double check with Michael for what he quoted earlier:
> > > > Nope:
> > > > 
> > > > 2.5.1  Driver Requirements: Device Configuration Space
> > > > 
> > > > ...
> > > > 
> > > > For optional configuration space fields, the driver MUST check
> > > > that the corresponding feature is offered
> > > > before accessing that part of the configuration space.
> > > 
> > > and how many driver bugs taking wrong assumption of the validity of
> > > config space field without features_ok. I am not sure what use case
> > > you want to expose config resister values for before features_ok, if
> > > it's mostly for live migration I guess it's probably heading a wrong
> > > direction.
> > 
> > 
> > I guess it's not for migration.
> Then what's the other possible use case than live migration, were to expose
> config space values? Troubleshooting config space discrepancy between vDPA
> and the emulated virtio device in userspace? Or tracking changes in config
> space across feature negotiation, but for what? It'd be beneficial to the
> interface design if the specific use case can be clearly described...
> 
> 
> > For migration, a provision with the correct features/capability would be
> > sufficient.
> Right, that's what I thought too. It doesn't need to expose config space
> values, simply exporting all attributes for vdpa device creation will do the
> work.
> 
> -Siwei
> 
> > 
> > Thanks
> > 
> > 
> > > 
> > > 
> > > > > 
> > > > > 
> > > > > Last but not the least, this "vdpa dev config" command was
> > > > > not designed to display the real config space register
> > > > > values in the first place. Quoting the vdpa-dev(8) man page:
> > > > > 
> > > > > > vdpa dev config show - Show configuration of specific
> > > > > > device or all devices.
> > > > > > DEV - specifies the vdpa device to show its
> > > > > > configuration. If this argument is omitted all devices
> > > > > > configuration is listed.
> > > > > It doesn't say anything about configuration space or
> > > > > register values in config space. As long as it can convey
> > > > > the config attribute when instantiating vDPA device
> > > > > instance, and more importantly, the config can be easily
> > > > > imported from or exported to userspace tools when trying to
> > > > > reconstruct vdpa instance intact on destination host for
> > > > > live migration, IMHO in my personal interpretation it
> > > > > doesn't matter what the config space may present. It may be
> > > > > worth while adding a new debug command to expose the real
> > > > > register value, but that's another story.
> > > > I am not sure getting your points. vDPA now reports device
> > > > feature bits(device_features) and negotiated feature
> > > > bits(driver_features), and yes, the drivers features can be a
> > > > subset of the device features; and the vDPA device features can
> > > > be a subset of the management device features.
> > > What I said is after unblocking the conditional check, you'd have to
> > > handle the case for each of the vdpa attribute when feature
> > > negotiation is not yet done: basically the register values you got
> > > from config space via the vdpa_get_config_unlocked() call is not
> > > considered to be valid before features_ok (per-spec). Although in
> > > some case you may get sane value, such behavior is generally
> > > undefined. If you desire to show just the device_features alone
> > > without any config space field, which the device had advertised
> > > *before feature negotiation is complete*, that'll be fine. But looks
> > > to me this is not how patch has been implemented. Probably need some
> > > more work?
> > > 
> > > Regards,
> > > -Siwei
> > > 
> > > > > 
> > > > > Having said, please consider to drop the Fixes tag, as
> > > > > appears to me you're proposing a new feature rather than
> > > > > fixing a real issue.
> > > > it's a new feature to report the device feature bits than only
> > > > negotiated features, however this patch is a must, or it will
> > > > block the device feature bits reporting. but I agree, the fix
> > > > tag is not a must.
> > > > > 
> > > > > Thanks,
> > > > > -Siwei
> > > > > 
> > > > > On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
> > > > > > > From: Zhu Lingshan<lingshan.zhu@intel.com>
> > > > > > > Sent: Friday, July 1, 2022 9:28 AM
> > > > > > > 
> > > > > > > Users may want to query the config space of a vDPA
> > > > > > > device, to choose a
> > > > > > > appropriate one for a certain guest. This means the
> > > > > > > users need to read the
> > > > > > > config space before FEATURES_OK, and the existence of config space
> > > > > > > contents does not depend on FEATURES_OK.
> > > > > > > 
> > > > > > > The spec says:
> > > > > > > The device MUST allow reading of any device-specific
> > > > > > > configuration field
> > > > > > > before FEATURES_OK is set by the driver. This
> > > > > > > includes fields which are
> > > > > > > conditional on feature bits, as long as those
> > > > > > > feature bits are offered by the
> > > > > > > device.
> > > > > > > 
> > > > > > > Fixes: 30ef7a8ac8a07 (vdpa: Read device
> > > > > > > configuration only if FEATURES_OK)
> > > > > > Fix is fine, but fixes tag needs correction described below.
> > > > > > 
> > > > > > Above commit id is 13 letters should be 12.
> > > > > > And
> > > > > > It should be in format
> > > > > > Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration
> > > > > > only if FEATURES_OK")
> > > > > > 
> > > > > > Please use checkpatch.pl script before posting the
> > > > > > patches to catch these errors.
> > > > > > There is a bot that looks at the fixes tag and
> > > > > > identifies the right kernel version to apply this fix.
> > > > > > 
> > > > > > > Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
> > > > > > > ---
> > > > > > >   drivers/vdpa/vdpa.c | 8 --------
> > > > > > >   1 file changed, 8 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> > > > > > > 9b0e39b2f022..d76b22b2f7ae 100644
> > > > > > > --- a/drivers/vdpa/vdpa.c
> > > > > > > +++ b/drivers/vdpa/vdpa.c
> > > > > > > @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
> > > > > > > struct sk_buff *msg, u32 portid,  {
> > > > > > >       u32 device_id;
> > > > > > >       void *hdr;
> > > > > > > -    u8 status;
> > > > > > >       int err;
> > > > > > > 
> > > > > > >       down_read(&vdev->cf_lock);
> > > > > > > -    status = vdev->config->get_status(vdev);
> > > > > > > -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> > > > > > > -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
> > > > > > > completed");
> > > > > > > -        err = -EAGAIN;
> > > > > > > -        goto out;
> > > > > > > -    }
> > > > > > > -
> > > > > > >       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
> > > > > > >                 VDPA_CMD_DEV_CONFIG_GET);
> > > > > > >       if (!hdr) {
> > > > > > > -- 
> > > > > > > 2.31.1
> > > > > > _______________________________________________
> > > > > > Virtualization mailing list
> > > > > > Virtualization@lists.linux-foundation.org
> > > > > > https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!Pkwym7OAjoDucUqs2fAwchxqL8-BGd6wOl-51xcgB_yCNwPJ_cs8A1y-cYmrLTB4OBNsimnZuqJPcvQIl3g$
> > > > > 
> > > > > 
> > > > 
> > > 
> >
Si-Wei Liu July 28, 2022, 10:12 p.m. UTC | #11
On 7/28/2022 4:35 AM, Michael S. Tsirkin wrote:
> On Thu, Jul 28, 2022 at 12:08:53AM -0700, Si-Wei Liu wrote:
>>
>> On 7/27/2022 7:06 PM, Jason Wang wrote:
>>> 在 2022/7/28 08:56, Si-Wei Liu 写道:
>>>>
>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>
>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>> Sorry to chime in late in the game. For some reason I
>>>>>> couldn't get to most emails for this discussion (I only
>>>>>> subscribed to the virtualization list), while I was taking
>>>>>> off amongst the past few weeks.
>>>>>>
>>>>>> It looks to me this patch is incomplete. Noted down the way
>>>>>> in vdpa_dev_net_config_fill(), we have the following:
>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>                                 VDPA_ATTR_PAD))
>>>>>>                   return -EMSGSIZE;
>>>>>>
>>>>>> Making call to .get_driver_features() doesn't make sense
>>>>>> when feature negotiation isn't complete. Neither should
>>>>>> present negotiated_features to userspace before negotiation
>>>>>> is done.
>>>>>>
>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>> probably should not show before negotiation is done - it
>>>>>> depends on driver features negotiated.
>>>>> I have another patch in this series introduces device_features
>>>>> and will report device_features to the userspace even features
>>>>> negotiation not done. Because the spec says we should allow
>>>>> driver access the config space before FEATURES_OK.
>>>> The config space can be accessed by guest before features_ok doesn't
>>>> necessarily mean the value is valid.
>>>
>>> It's valid as long as the device offers the feature:
>>>
>>> "The device MUST allow reading of any device-specific configuration
>>> field before FEATURES_OK is set by the driver. This includes fields
>>> which are conditional on feature bits, as long as those feature bits are
>>> offered by the device."
>> I guess this statement only conveys that the field in config space can be
>> read before FEATURES_OK is set, though it does not *explicitly* states the
>> validity of field.
>>
>> And looking at:
>>
>> "The mac address field always exists (though is only valid if
>> VIRTIO_NET_F_MAC is set), and status only exists if VIRTIO_NET_F_STATUS is
>> set."
>>
>> It appears to me there's a border line set between "exist" and "valid". If I
>> understand the spec wording correctly, a spec-conforming device
>> implementation may or may not offer valid status value in the config space
>> when VIRTIO_NET_F_STATUS is offered, but before the feature is negotiated.
>> On the other hand, config space should contain valid mac address the moment
>> VIRTIO_NET_F_MAC feature is offered, regardless being negotiated or not. By
>> that, there seems to be leeway for the device implementation to decide when
>> config space field may become valid, though for most of QEMU's software
>> virtio devices, valid value is present to config space the very first moment
>> when feature is offered.
>>
>> "If the VIRTIO_NET_F_MAC feature bit is set, the configuration space mac
>> entry indicates the “physical” address of the network card, otherwise the
>> driver would typically generate a random local MAC address."
>> "If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link status comes
>> from the bottom bit of status. Otherwise, the driver assumes it’s active."
>>
>> And also there are special cases where the read of specific configuration
>> space field MUST be deferred to until FEATURES_OK is set:
>>
>> "If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache mode can be
>> read or set through the writeback field. 0 corresponds to a writethrough
>> cache, 1 to a writeback cache11. The cache mode after reset can be either
>> writeback or writethrough. The actual mode can be determined by reading
>> writeback after feature negotiation."
>> "The driver MUST NOT read writeback before setting the FEATURES_OK device
>> status bit."
>> "If VIRTIO_BLK_F_CONFIG_WCE is negotiated but VIRTIO_BLK_F_FLUSH is not, the
>> device MUST initialize writeback to 0."
>>
>> Since the spec doesn't explicitly mandate the validity of each config space
>> field when feature of concern is offered, to be safer we'd have to live with
>> odd device implementation. I know for sure QEMU software devices won't for
>> 99% of these cases, but that's not what is currently defined in the spec.
>
> Thanks for raising this subject. I started working on this in April:
>
> https://urldefense.com/v3/__https://lists.oasis-open.org/archives/virtio-comment/202201/msg00068.html__;!!ACWV5N9M2RV99hQ!Os6QE_RJokx7Us9y7-5-ByVVLuy3oLuPodAdZWxwJw_aNkJY0o0H7691FI9aYSTRLVieASUD_eOu$
>
> working now to address the comments.
Great, thank you very much!

Is there a link to the latest draft that reflects the change uptodate? 
The one above with iterative feature negotiation seemed getting some 
resistance because of backward incompatibility with older spec? Please 
copy me in the loop when you have the draft ready. I am not in the 
virtio-comment list.

Thanks,
-Siwei
>
>
>>>
>>>> You may want to double check with Michael for what he quoted earlier:
>>>>> Nope:
>>>>>
>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>
>>>>> ...
>>>>>
>>>>> For optional configuration space fields, the driver MUST check
>>>>> that the corresponding feature is offered
>>>>> before accessing that part of the configuration space.
>>>> and how many driver bugs taking wrong assumption of the validity of
>>>> config space field without features_ok. I am not sure what use case
>>>> you want to expose config resister values for before features_ok, if
>>>> it's mostly for live migration I guess it's probably heading a wrong
>>>> direction.
>>>
>>> I guess it's not for migration.
>> Then what's the other possible use case than live migration, were to expose
>> config space values? Troubleshooting config space discrepancy between vDPA
>> and the emulated virtio device in userspace? Or tracking changes in config
>> space across feature negotiation, but for what? It'd be beneficial to the
>> interface design if the specific use case can be clearly described...
>>
>>
>>> For migration, a provision with the correct features/capability would be
>>> sufficient.
>> Right, that's what I thought too. It doesn't need to expose config space
>> values, simply exporting all attributes for vdpa device creation will do the
>> work.
>>
>> -Siwei
>>
>>> Thanks
>>>
>>>
>>>>
>>>>>>
>>>>>> Last but not the least, this "vdpa dev config" command was
>>>>>> not designed to display the real config space register
>>>>>> values in the first place. Quoting the vdpa-dev(8) man page:
>>>>>>
>>>>>>> vdpa dev config show - Show configuration of specific
>>>>>>> device or all devices.
>>>>>>> DEV - specifies the vdpa device to show its
>>>>>>> configuration. If this argument is omitted all devices
>>>>>>> configuration is listed.
>>>>>> It doesn't say anything about configuration space or
>>>>>> register values in config space. As long as it can convey
>>>>>> the config attribute when instantiating vDPA device
>>>>>> instance, and more importantly, the config can be easily
>>>>>> imported from or exported to userspace tools when trying to
>>>>>> reconstruct vdpa instance intact on destination host for
>>>>>> live migration, IMHO in my personal interpretation it
>>>>>> doesn't matter what the config space may present. It may be
>>>>>> worth while adding a new debug command to expose the real
>>>>>> register value, but that's another story.
>>>>> I am not sure getting your points. vDPA now reports device
>>>>> feature bits(device_features) and negotiated feature
>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>> subset of the device features; and the vDPA device features can
>>>>> be a subset of the management device features.
>>>> What I said is after unblocking the conditional check, you'd have to
>>>> handle the case for each of the vdpa attribute when feature
>>>> negotiation is not yet done: basically the register values you got
>>>> from config space via the vdpa_get_config_unlocked() call is not
>>>> considered to be valid before features_ok (per-spec). Although in
>>>> some case you may get sane value, such behavior is generally
>>>> undefined. If you desire to show just the device_features alone
>>>> without any config space field, which the device had advertised
>>>> *before feature negotiation is complete*, that'll be fine. But looks
>>>> to me this is not how patch has been implemented. Probably need some
>>>> more work?
>>>>
>>>> Regards,
>>>> -Siwei
>>>>
>>>>>> Having said, please consider to drop the Fixes tag, as
>>>>>> appears to me you're proposing a new feature rather than
>>>>>> fixing a real issue.
>>>>> it's a new feature to report the device feature bits than only
>>>>> negotiated features, however this patch is a must, or it will
>>>>> block the device feature bits reporting. but I agree, the fix
>>>>> tag is not a must.
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>>
>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>
>>>>>>>> Users may want to query the config space of a vDPA
>>>>>>>> device, to choose a
>>>>>>>> appropriate one for a certain guest. This means the
>>>>>>>> users need to read the
>>>>>>>> config space before FEATURES_OK, and the existence of config space
>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>
>>>>>>>> The spec says:
>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>> configuration field
>>>>>>>> before FEATURES_OK is set by the driver. This
>>>>>>>> includes fields which are
>>>>>>>> conditional on feature bits, as long as those
>>>>>>>> feature bits are offered by the
>>>>>>>> device.
>>>>>>>>
>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device
>>>>>>>> configuration only if FEATURES_OK)
>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>
>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>> And
>>>>>>> It should be in format
>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration
>>>>>>> only if FEATURES_OK")
>>>>>>>
>>>>>>> Please use checkpatch.pl script before posting the
>>>>>>> patches to catch these errors.
>>>>>>> There is a bot that looks at the fixes tag and
>>>>>>> identifies the right kernel version to apply this fix.
>>>>>>>
>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> ---
>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>        u32 device_id;
>>>>>>>>        void *hdr;
>>>>>>>> -    u8 status;
>>>>>>>>        int err;
>>>>>>>>
>>>>>>>>        down_read(&vdev->cf_lock);
>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>> completed");
>>>>>>>> -        err = -EAGAIN;
>>>>>>>> -        goto out;
>>>>>>>> -    }
>>>>>>>> -
>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>                  VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>        if (!hdr) {
>>>>>>>> -- 
>>>>>>>> 2.31.1
>>>>>>> _______________________________________________
>>>>>>> Virtualization mailing list
>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!Pkwym7OAjoDucUqs2fAwchxqL8-BGd6wOl-51xcgB_yCNwPJ_cs8A1y-cYmrLTB4OBNsimnZuqJPcvQIl3g$
>>>>>>
Si-Wei Liu July 29, 2022, 8:55 p.m. UTC | #12
On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>
>
> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>
>>
>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>
>>>>
>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>
>>>>>
>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>> Sorry to chime in late in the game. For some reason I couldn't 
>>>>>> get to most emails for this discussion (I only subscribed to the 
>>>>>> virtualization list), while I was taking off amongst the past few 
>>>>>> weeks.
>>>>>>
>>>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>          features = vdev->config->get_driver_features(vdev);
>>>>>>          if (nla_put_u64_64bit(msg, VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>                                VDPA_ATTR_PAD))
>>>>>>                  return -EMSGSIZE;
>>>>>>
>>>>>> Making call to .get_driver_features() doesn't make sense when 
>>>>>> feature negotiation isn't complete. Neither should present 
>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>
>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() probably 
>>>>>> should not show before negotiation is done - it depends on driver 
>>>>>> features negotiated.
>>>>> I have another patch in this series introduces device_features and 
>>>>> will report device_features to the userspace even features 
>>>>> negotiation not done. Because the spec says we should allow driver 
>>>>> access the config space before FEATURES_OK.
>>>> The config space can be accessed by guest before features_ok 
>>>> doesn't necessarily mean the value is valid. You may want to double 
>>>> check with Michael for what he quoted earlier:
>>> that's why I proposed to fix these issues, e.g., if no _F_MAC, vDPA 
>>> kernel should not return a mac to the userspace, there is not a 
>>> default value for mac.
>> Then please show us the code, as I can only comment based on your 
>> latest (v4) patch and it was not there.. To be honest, I don't 
>> understand the motivation and the use cases you have, is it for 
>> debugging/monitoring or there's really a use case for live migration? 
>> For the former, you can do a direct dump on all config space fields 
>> regardless of endianess and feature negotiation without having to 
>> worry about validity (meaningful to present to admin user). To me 
>> these are conflict asks that is impossible to mix in exact one command.
> This bug just has been revealed two days, and you will see the patch soon.
>
> There are something to clarify:
> 1) we need to read the device features, or how can you pick a proper 
> LM destination
> 2) vdpa dev config show can show both device features and driver 
> features, there just need a patch for iproute2
> 3) To process information like MQ, we don't just dump the config 
> space, MST has explained before
So, it's for live migration... Then why not export those config 
parameters specified for vdpa creation (as well as device feature bits) 
to the output of "vdpa dev show" command? That's where device side 
config lives and is static across vdpa's life cycle. "vdpa dev config 
show" is mostly for dynamic driver side config, and the validity is 
subject to feature negotiation. I suppose this should suit your need of 
LM, e.g.

$ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
$ vdpa dev show vdpa1
vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 15 
max_vq_size 256
   max_vqp 7 mtu 2000
   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ 
MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED

For it to work, you'd want to pass "struct vdpa_dev_set_config" to 
_vdpa_register_device() during registration, and get it saved there in 
"struct vdpa_device". Then in vdpa_dev_fill() show each field 
conditionally subject to "struct vdpa_dev_set_config.mask".

Thanks,
-Siwei
>
> Thanks
> Zhu Lingshan
>>
>>>>> Nope:
>>>>>
>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>
>>>>> ...
>>>>>
>>>>> For optional configuration space fields, the driver MUST check that the corresponding feature is offered
>>>>> before accessing that part of the configuration space.
>>>>
>>>> and how many driver bugs taking wrong assumption of the validity of 
>>>> config space field without features_ok. I am not sure what use case 
>>>> you want to expose config resister values for before features_ok, 
>>>> if it's mostly for live migration I guess it's probably heading a 
>>>> wrong direction.
>>>>
>>>>
>>>>>>
>>>>>>
>>>>>> Last but not the least, this "vdpa dev config" command was not 
>>>>>> designed to display the real config space register values in the 
>>>>>> first place. Quoting the vdpa-dev(8) man page:
>>>>>>
>>>>>>> vdpa dev config show - Show configuration of specific device or 
>>>>>>> all devices.
>>>>>>> DEV - specifies the vdpa device to show its configuration. If 
>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>> It doesn't say anything about configuration space or register 
>>>>>> values in config space. As long as it can convey the config 
>>>>>> attribute when instantiating vDPA device instance, and more 
>>>>>> importantly, the config can be easily imported from or exported 
>>>>>> to userspace tools when trying to reconstruct vdpa instance 
>>>>>> intact on destination host for live migration, IMHO in my 
>>>>>> personal interpretation it doesn't matter what the config space 
>>>>>> may present. It may be worth while adding a new debug command to 
>>>>>> expose the real register value, but that's another story.
>>>>> I am not sure getting your points. vDPA now reports device feature 
>>>>> bits(device_features) and negotiated feature 
>>>>> bits(driver_features), and yes, the drivers features can be a 
>>>>> subset of the device features; and the vDPA device features can be 
>>>>> a subset of the management device features.
>>>> What I said is after unblocking the conditional check, you'd have 
>>>> to handle the case for each of the vdpa attribute when feature 
>>>> negotiation is not yet done: basically the register values you got 
>>>> from config space via the vdpa_get_config_unlocked() call is not 
>>>> considered to be valid before features_ok (per-spec). Although in 
>>>> some case you may get sane value, such behavior is generally 
>>>> undefined. If you desire to show just the device_features alone 
>>>> without any config space field, which the device had advertised 
>>>> *before feature negotiation is complete*, that'll be fine. But 
>>>> looks to me this is not how patch has been implemented. Probably 
>>>> need some more work?
>>> They are driver_features(negotiated) and the device_features(which 
>>> comes with the device), and the config space fields that depend on 
>>> them. In this series, we report both to the userspace.
>> I fail to understand what you want to present from your description. 
>> May be worth showing some example outputs that at least include the 
>> following cases: 1) when device offers features but not yet 
>> acknowledge by guest 2) when guest acknowledged features and device 
>> is yet to accept 3) after guest feature negotiation is completed 
>> (agreed upon between guest and device).
> Only two feature sets: 1) what the device has. (2) what is negotiated
>>
>> Thanks,
>> -Siwei
>>>>
>>>> Regards,
>>>> -Siwei
>>>>
>>>>>>
>>>>>> Having said, please consider to drop the Fixes tag, as appears to 
>>>>>> me you're proposing a new feature rather than fixing a real issue.
>>>>> it's a new feature to report the device feature bits than only 
>>>>> negotiated features, however this patch is a must, or it will 
>>>>> block the device feature bits reporting. but I agree, the fix tag 
>>>>> is not a must.
>>>>>>
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>>
>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>
>>>>>>>> Users may want to query the config space of a vDPA device, to choose a
>>>>>>>> appropriate one for a certain guest. This means the users need to read the
>>>>>>>> config space before FEATURES_OK, and the existence of config space
>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>
>>>>>>>> The spec says:
>>>>>>>> The device MUST allow reading of any device-specific configuration field
>>>>>>>> before FEATURES_OK is set by the driver. This includes fields which are
>>>>>>>> conditional on feature bits, as long as those feature bits are offered by the
>>>>>>>> device.
>>>>>>>>
>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if FEATURES_OK)
>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>
>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>> And
>>>>>>> It should be in format
>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if FEATURES_OK")
>>>>>>>
>>>>>>> Please use checkpatch.pl script before posting the patches to catch these errors.
>>>>>>> There is a bot that looks at the fixes tag and identifies the right kernel version to apply this fix.
>>>>>>>
>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>> ---
>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>   1 file changed, 8 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device *vdev,
>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>   	u32 device_id;
>>>>>>>>   	void *hdr;
>>>>>>>> -	u8 status;
>>>>>>>>   	int err;
>>>>>>>>
>>>>>>>>   	down_read(&vdev->cf_lock);
>>>>>>>> -	status = vdev->config->get_status(vdev);
>>>>>>>> -	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>> -		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>> completed");
>>>>>>>> -		err = -EAGAIN;
>>>>>>>> -		goto out;
>>>>>>>> -	}
>>>>>>>> -
>>>>>>>>   	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>   			  VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>   	if (!hdr) {
>>>>>>>> --
>>>>>>>> 2.31.1
>>>>>>> _______________________________________________
>>>>>>> Virtualization mailing list
>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>
>>>>>
>>>>
>>>
>>
>
Jason Wang Aug. 1, 2022, 4:44 a.m. UTC | #13
在 2022/7/30 04:55, Si-Wei Liu 写道:
>
>
> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>
>>
>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>
>>>
>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>
>>>>
>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>
>>>>>
>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>
>>>>>>
>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>> Sorry to chime in late in the game. For some reason I couldn't 
>>>>>>> get to most emails for this discussion (I only subscribed to the 
>>>>>>> virtualization list), while I was taking off amongst the past 
>>>>>>> few weeks.
>>>>>>>
>>>>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>          features = vdev->config->get_driver_features(vdev);
>>>>>>>          if (nla_put_u64_64bit(msg, 
>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>                                VDPA_ATTR_PAD))
>>>>>>>                  return -EMSGSIZE;
>>>>>>>
>>>>>>> Making call to .get_driver_features() doesn't make sense when 
>>>>>>> feature negotiation isn't complete. Neither should present 
>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>
>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() 
>>>>>>> probably should not show before negotiation is done - it depends 
>>>>>>> on driver features negotiated.
>>>>>> I have another patch in this series introduces device_features 
>>>>>> and will report device_features to the userspace even features 
>>>>>> negotiation not done. Because the spec says we should allow 
>>>>>> driver access the config space before FEATURES_OK.
>>>>> The config space can be accessed by guest before features_ok 
>>>>> doesn't necessarily mean the value is valid. You may want to 
>>>>> double check with Michael for what he quoted earlier:
>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC, vDPA 
>>>> kernel should not return a mac to the userspace, there is not a 
>>>> default value for mac.
>>> Then please show us the code, as I can only comment based on your 
>>> latest (v4) patch and it was not there.. To be honest, I don't 
>>> understand the motivation and the use cases you have, is it for 
>>> debugging/monitoring or there's really a use case for live 
>>> migration? For the former, you can do a direct dump on all config 
>>> space fields regardless of endianess and feature negotiation without 
>>> having to worry about validity (meaningful to present to admin 
>>> user). To me these are conflict asks that is impossible to mix in 
>>> exact one command.
>> This bug just has been revealed two days, and you will see the patch 
>> soon.
>>
>> There are something to clarify:
>> 1) we need to read the device features, or how can you pick a proper 
>> LM destination


So it's probably not very efficient to use this, the manager layer 
should have the knowledge about the compatibility before doing migration 
other than try-and-fail.

And it's the task of the management to gather the nodes whose devices 
could be live migrated to each other as something like "cluster" which 
we've already used in the case of cpuflags.

1) during node bootstrap, the capability of each node and devices was 
reported to management layer
2) management layer decide the cluster and make sure the migration can 
only done among the nodes insides the cluster
3) before migration, the vDPA needs to be provisioned on the destination


>> 2) vdpa dev config show can show both device features and driver 
>> features, there just need a patch for iproute2
>> 3) To process information like MQ, we don't just dump the config 
>> space, MST has explained before
> So, it's for live migration... Then why not export those config 
> parameters specified for vdpa creation (as well as device feature 
> bits) to the output of "vdpa dev show" command? That's where device 
> side config lives and is static across vdpa's life cycle. "vdpa dev 
> config show" is mostly for dynamic driver side config, and the 
> validity is subject to feature negotiation. I suppose this should suit 
> your need of LM, e.g.


I think so.


>
> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
> $ vdpa dev show vdpa1
> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 15 
> max_vq_size 256
>   max_vqp 7 mtu 2000
>   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ 
> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED


Note that the mgmt should know this destination have those 
capability/features before the provisioning.


>
> For it to work, you'd want to pass "struct vdpa_dev_set_config" to 
> _vdpa_register_device() during registration, and get it saved there in 
> "struct vdpa_device". Then in vdpa_dev_fill() show each field 
> conditionally subject to "struct vdpa_dev_set_config.mask".
>
> Thanks,
> -Siwei


Thanks


>>
>> Thanks
>> Zhu Lingshan
>>>
>>>>>> Nope:
>>>>>>
>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> For optional configuration space fields, the driver MUST check 
>>>>>> that the corresponding feature is offered
>>>>>> before accessing that part of the configuration space.
>>>>>
>>>>> and how many driver bugs taking wrong assumption of the validity 
>>>>> of config space field without features_ok. I am not sure what use 
>>>>> case you want to expose config resister values for before 
>>>>> features_ok, if it's mostly for live migration I guess it's 
>>>>> probably heading a wrong direction.
>>>>>
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Last but not the least, this "vdpa dev config" command was not 
>>>>>>> designed to display the real config space register values in the 
>>>>>>> first place. Quoting the vdpa-dev(8) man page:
>>>>>>>
>>>>>>>> vdpa dev config show - Show configuration of specific device or 
>>>>>>>> all devices.
>>>>>>>> DEV - specifies the vdpa device to show its configuration. If 
>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>> It doesn't say anything about configuration space or register 
>>>>>>> values in config space. As long as it can convey the config 
>>>>>>> attribute when instantiating vDPA device instance, and more 
>>>>>>> importantly, the config can be easily imported from or exported 
>>>>>>> to userspace tools when trying to reconstruct vdpa instance 
>>>>>>> intact on destination host for live migration, IMHO in my 
>>>>>>> personal interpretation it doesn't matter what the config space 
>>>>>>> may present. It may be worth while adding a new debug command to 
>>>>>>> expose the real register value, but that's another story.
>>>>>> I am not sure getting your points. vDPA now reports device 
>>>>>> feature bits(device_features) and negotiated feature 
>>>>>> bits(driver_features), and yes, the drivers features can be a 
>>>>>> subset of the device features; and the vDPA device features can 
>>>>>> be a subset of the management device features.
>>>>> What I said is after unblocking the conditional check, you'd have 
>>>>> to handle the case for each of the vdpa attribute when feature 
>>>>> negotiation is not yet done: basically the register values you got 
>>>>> from config space via the vdpa_get_config_unlocked() call is not 
>>>>> considered to be valid before features_ok (per-spec). Although in 
>>>>> some case you may get sane value, such behavior is generally 
>>>>> undefined. If you desire to show just the device_features alone 
>>>>> without any config space field, which the device had advertised 
>>>>> *before feature negotiation is complete*, that'll be fine. But 
>>>>> looks to me this is not how patch has been implemented. Probably 
>>>>> need some more work?
>>>> They are driver_features(negotiated) and the device_features(which 
>>>> comes with the device), and the config space fields that depend on 
>>>> them. In this series, we report both to the userspace.
>>> I fail to understand what you want to present from your description. 
>>> May be worth showing some example outputs that at least include the 
>>> following cases: 1) when device offers features but not yet 
>>> acknowledge by guest 2) when guest acknowledged features and device 
>>> is yet to accept 3) after guest feature negotiation is completed 
>>> (agreed upon between guest and device).
>> Only two feature sets: 1) what the device has. (2) what is negotiated
>>>
>>> Thanks,
>>> -Siwei
>>>>>
>>>>> Regards,
>>>>> -Siwei
>>>>>
>>>>>>>
>>>>>>> Having said, please consider to drop the Fixes tag, as appears 
>>>>>>> to me you're proposing a new feature rather than fixing a real 
>>>>>>> issue.
>>>>>> it's a new feature to report the device feature bits than only 
>>>>>> negotiated features, however this patch is a must, or it will 
>>>>>> block the device feature bits reporting. but I agree, the fix tag 
>>>>>> is not a must.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Siwei
>>>>>>>
>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>
>>>>>>>>> Users may want to query the config space of a vDPA device, to 
>>>>>>>>> choose a
>>>>>>>>> appropriate one for a certain guest. This means the users need 
>>>>>>>>> to read the
>>>>>>>>> config space before FEATURES_OK, and the existence of config 
>>>>>>>>> space
>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>
>>>>>>>>> The spec says:
>>>>>>>>> The device MUST allow reading of any device-specific 
>>>>>>>>> configuration field
>>>>>>>>> before FEATURES_OK is set by the driver. This includes fields 
>>>>>>>>> which are
>>>>>>>>> conditional on feature bits, as long as those feature bits are 
>>>>>>>>> offered by the
>>>>>>>>> device.
>>>>>>>>>
>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if 
>>>>>>>>> FEATURES_OK)
>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>
>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>> And
>>>>>>>> It should be in format
>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if 
>>>>>>>> FEATURES_OK")
>>>>>>>>
>>>>>>>> Please use checkpatch.pl script before posting the patches to 
>>>>>>>> catch these errors.
>>>>>>>> There is a bot that looks at the fixes tag and identifies the 
>>>>>>>> right kernel version to apply this fix.
>>>>>>>>
>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>> ---
>>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>   1 file changed, 8 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device 
>>>>>>>>> *vdev,
>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>       u32 device_id;
>>>>>>>>>       void *hdr;
>>>>>>>>> -    u8 status;
>>>>>>>>>       int err;
>>>>>>>>>
>>>>>>>>>       down_read(&vdev->cf_lock);
>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>> completed");
>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>> -        goto out;
>>>>>>>>> -    }
>>>>>>>>> -
>>>>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
>>>>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>       if (!hdr) {
>>>>>>>>> -- 
>>>>>>>>> 2.31.1
>>>>>>>> _______________________________________________
>>>>>>>> Virtualization mailing list
>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
Si-Wei Liu Aug. 1, 2022, 10:53 p.m. UTC | #14
On 7/31/2022 9:44 PM, Jason Wang wrote:
>
> 在 2022/7/30 04:55, Si-Wei Liu 写道:
>>
>>
>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>
>>>>
>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>
>>>>>
>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>
>>>>>>
>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't 
>>>>>>>> get to most emails for this discussion (I only subscribed to 
>>>>>>>> the virtualization list), while I was taking off amongst the 
>>>>>>>> past few weeks.
>>>>>>>>
>>>>>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>          features = vdev->config->get_driver_features(vdev);
>>>>>>>>          if (nla_put_u64_64bit(msg, 
>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>                                VDPA_ATTR_PAD))
>>>>>>>>                  return -EMSGSIZE;
>>>>>>>>
>>>>>>>> Making call to .get_driver_features() doesn't make sense when 
>>>>>>>> feature negotiation isn't complete. Neither should present 
>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>
>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() 
>>>>>>>> probably should not show before negotiation is done - it 
>>>>>>>> depends on driver features negotiated.
>>>>>>> I have another patch in this series introduces device_features 
>>>>>>> and will report device_features to the userspace even features 
>>>>>>> negotiation not done. Because the spec says we should allow 
>>>>>>> driver access the config space before FEATURES_OK.
>>>>>> The config space can be accessed by guest before features_ok 
>>>>>> doesn't necessarily mean the value is valid. You may want to 
>>>>>> double check with Michael for what he quoted earlier:
>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC, 
>>>>> vDPA kernel should not return a mac to the userspace, there is not 
>>>>> a default value for mac.
>>>> Then please show us the code, as I can only comment based on your 
>>>> latest (v4) patch and it was not there.. To be honest, I don't 
>>>> understand the motivation and the use cases you have, is it for 
>>>> debugging/monitoring or there's really a use case for live 
>>>> migration? For the former, you can do a direct dump on all config 
>>>> space fields regardless of endianess and feature negotiation 
>>>> without having to worry about validity (meaningful to present to 
>>>> admin user). To me these are conflict asks that is impossible to 
>>>> mix in exact one command.
>>> This bug just has been revealed two days, and you will see the patch 
>>> soon.
>>>
>>> There are something to clarify:
>>> 1) we need to read the device features, or how can you pick a proper 
>>> LM destination
>
>
> So it's probably not very efficient to use this, the manager layer 
> should have the knowledge about the compatibility before doing 
> migration other than try-and-fail.
>
> And it's the task of the management to gather the nodes whose devices 
> could be live migrated to each other as something like "cluster" which 
> we've already used in the case of cpuflags.
>
> 1) during node bootstrap, the capability of each node and devices was 
> reported to management layer
> 2) management layer decide the cluster and make sure the migration can 
> only done among the nodes insides the cluster
> 3) before migration, the vDPA needs to be provisioned on the destination
>
>
>>> 2) vdpa dev config show can show both device features and driver 
>>> features, there just need a patch for iproute2
>>> 3) To process information like MQ, we don't just dump the config 
>>> space, MST has explained before
>> So, it's for live migration... Then why not export those config 
>> parameters specified for vdpa creation (as well as device feature 
>> bits) to the output of "vdpa dev show" command? That's where device 
>> side config lives and is static across vdpa's life cycle. "vdpa dev 
>> config show" is mostly for dynamic driver side config, and the 
>> validity is subject to feature negotiation. I suppose this should 
>> suit your need of LM, e.g.
>
>
> I think so.
>
>
>>
>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
>> $ vdpa dev show vdpa1
>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 
>> 15 max_vq_size 256
>>   max_vqp 7 mtu 2000
>>   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ 
>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>
>
> Note that the mgmt should know this destination have those 
> capability/features before the provisioning.
Yes, mgmt software should have to check the above from source.

>
>
>>
>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to 
>> _vdpa_register_device() during registration, and get it saved there 
>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field 
>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>
>> Thanks,
>> -Siwei
>
>
> Thanks
>
>
>>>
>>> Thanks
>>> Zhu Lingshan
>>>>
>>>>>>> Nope:
>>>>>>>
>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> For optional configuration space fields, the driver MUST check 
>>>>>>> that the corresponding feature is offered
>>>>>>> before accessing that part of the configuration space.
>>>>>>
>>>>>> and how many driver bugs taking wrong assumption of the validity 
>>>>>> of config space field without features_ok. I am not sure what use 
>>>>>> case you want to expose config resister values for before 
>>>>>> features_ok, if it's mostly for live migration I guess it's 
>>>>>> probably heading a wrong direction.
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Last but not the least, this "vdpa dev config" command was not 
>>>>>>>> designed to display the real config space register values in 
>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>
>>>>>>>>> vdpa dev config show - Show configuration of specific device 
>>>>>>>>> or all devices.
>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If 
>>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>>> It doesn't say anything about configuration space or register 
>>>>>>>> values in config space. As long as it can convey the config 
>>>>>>>> attribute when instantiating vDPA device instance, and more 
>>>>>>>> importantly, the config can be easily imported from or exported 
>>>>>>>> to userspace tools when trying to reconstruct vdpa instance 
>>>>>>>> intact on destination host for live migration, IMHO in my 
>>>>>>>> personal interpretation it doesn't matter what the config space 
>>>>>>>> may present. It may be worth while adding a new debug command 
>>>>>>>> to expose the real register value, but that's another story.
>>>>>>> I am not sure getting your points. vDPA now reports device 
>>>>>>> feature bits(device_features) and negotiated feature 
>>>>>>> bits(driver_features), and yes, the drivers features can be a 
>>>>>>> subset of the device features; and the vDPA device features can 
>>>>>>> be a subset of the management device features.
>>>>>> What I said is after unblocking the conditional check, you'd have 
>>>>>> to handle the case for each of the vdpa attribute when feature 
>>>>>> negotiation is not yet done: basically the register values you 
>>>>>> got from config space via the vdpa_get_config_unlocked() call is 
>>>>>> not considered to be valid before features_ok (per-spec). 
>>>>>> Although in some case you may get sane value, such behavior is 
>>>>>> generally undefined. If you desire to show just the 
>>>>>> device_features alone without any config space field, which the 
>>>>>> device had advertised *before feature negotiation is complete*, 
>>>>>> that'll be fine. But looks to me this is not how patch has been 
>>>>>> implemented. Probably need some more work?
>>>>> They are driver_features(negotiated) and the device_features(which 
>>>>> comes with the device), and the config space fields that depend on 
>>>>> them. In this series, we report both to the userspace.
>>>> I fail to understand what you want to present from your 
>>>> description. May be worth showing some example outputs that at 
>>>> least include the following cases: 1) when device offers features 
>>>> but not yet acknowledge by guest 2) when guest acknowledged 
>>>> features and device is yet to accept 3) after guest feature 
>>>> negotiation is completed (agreed upon between guest and device).
>>> Only two feature sets: 1) what the device has. (2) what is negotiated
>>>>
>>>> Thanks,
>>>> -Siwei
>>>>>>
>>>>>> Regards,
>>>>>> -Siwei
>>>>>>
>>>>>>>>
>>>>>>>> Having said, please consider to drop the Fixes tag, as appears 
>>>>>>>> to me you're proposing a new feature rather than fixing a real 
>>>>>>>> issue.
>>>>>>> it's a new feature to report the device feature bits than only 
>>>>>>> negotiated features, however this patch is a must, or it will 
>>>>>>> block the device feature bits reporting. but I agree, the fix 
>>>>>>> tag is not a must.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>>
>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>
>>>>>>>>>> Users may want to query the config space of a vDPA device, to 
>>>>>>>>>> choose a
>>>>>>>>>> appropriate one for a certain guest. This means the users 
>>>>>>>>>> need to read the
>>>>>>>>>> config space before FEATURES_OK, and the existence of config 
>>>>>>>>>> space
>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>
>>>>>>>>>> The spec says:
>>>>>>>>>> The device MUST allow reading of any device-specific 
>>>>>>>>>> configuration field
>>>>>>>>>> before FEATURES_OK is set by the driver. This includes fields 
>>>>>>>>>> which are
>>>>>>>>>> conditional on feature bits, as long as those feature bits 
>>>>>>>>>> are offered by the
>>>>>>>>>> device.
>>>>>>>>>>
>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only if 
>>>>>>>>>> FEATURES_OK)
>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>
>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>> And
>>>>>>>>> It should be in format
>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if 
>>>>>>>>> FEATURES_OK")
>>>>>>>>>
>>>>>>>>> Please use checkpatch.pl script before posting the patches to 
>>>>>>>>> catch these errors.
>>>>>>>>> There is a bot that looks at the fixes tag and identifies the 
>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>
>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>> ---
>>>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>   1 file changed, 8 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device 
>>>>>>>>>> *vdev,
>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>       u32 device_id;
>>>>>>>>>>       void *hdr;
>>>>>>>>>> -    u8 status;
>>>>>>>>>>       int err;
>>>>>>>>>>
>>>>>>>>>>       down_read(&vdev->cf_lock);
>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>>> completed");
>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>> -        goto out;
>>>>>>>>>> -    }
>>>>>>>>>> -
>>>>>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, 
>>>>>>>>>> flags,
>>>>>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>       if (!hdr) {
>>>>>>>>>> -- 
>>>>>>>>>> 2.31.1
>>>>>>>>> _______________________________________________
>>>>>>>>> Virtualization mailing list
>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$ 
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
Si-Wei Liu Aug. 1, 2022, 10:58 p.m. UTC | #15
On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>
>
> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>
>> 在 2022/7/30 04:55, Si-Wei Liu 写道:
>>>
>>>
>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>
>>>>
>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>
>>>>>
>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>
>>>>>>
>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't 
>>>>>>>>> get to most emails for this discussion (I only subscribed to 
>>>>>>>>> the virtualization list), while I was taking off amongst the 
>>>>>>>>> past few weeks.
>>>>>>>>>
>>>>>>>>> It looks to me this patch is incomplete. Noted down the way in 
>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>          features = vdev->config->get_driver_features(vdev);
>>>>>>>>>          if (nla_put_u64_64bit(msg, 
>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>                                VDPA_ATTR_PAD))
>>>>>>>>>                  return -EMSGSIZE;
>>>>>>>>>
>>>>>>>>> Making call to .get_driver_features() doesn't make sense when 
>>>>>>>>> feature negotiation isn't complete. Neither should present 
>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>
>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill() 
>>>>>>>>> probably should not show before negotiation is done - it 
>>>>>>>>> depends on driver features negotiated.
>>>>>>>> I have another patch in this series introduces device_features 
>>>>>>>> and will report device_features to the userspace even features 
>>>>>>>> negotiation not done. Because the spec says we should allow 
>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>> The config space can be accessed by guest before features_ok 
>>>>>>> doesn't necessarily mean the value is valid. You may want to 
>>>>>>> double check with Michael for what he quoted earlier:
>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC, 
>>>>>> vDPA kernel should not return a mac to the userspace, there is 
>>>>>> not a default value for mac.
>>>>> Then please show us the code, as I can only comment based on your 
>>>>> latest (v4) patch and it was not there.. To be honest, I don't 
>>>>> understand the motivation and the use cases you have, is it for 
>>>>> debugging/monitoring or there's really a use case for live 
>>>>> migration? For the former, you can do a direct dump on all config 
>>>>> space fields regardless of endianess and feature negotiation 
>>>>> without having to worry about validity (meaningful to present to 
>>>>> admin user). To me these are conflict asks that is impossible to 
>>>>> mix in exact one command.
>>>> This bug just has been revealed two days, and you will see the 
>>>> patch soon.
>>>>
>>>> There are something to clarify:
>>>> 1) we need to read the device features, or how can you pick a 
>>>> proper LM destination
>>
>>
>> So it's probably not very efficient to use this, the manager layer 
>> should have the knowledge about the compatibility before doing 
>> migration other than try-and-fail.
>>
>> And it's the task of the management to gather the nodes whose devices 
>> could be live migrated to each other as something like "cluster" 
>> which we've already used in the case of cpuflags.
>>
>> 1) during node bootstrap, the capability of each node and devices was 
>> reported to management layer
>> 2) management layer decide the cluster and make sure the migration 
>> can only done among the nodes insides the cluster
>> 3) before migration, the vDPA needs to be provisioned on the destination
>>
>>
>>>> 2) vdpa dev config show can show both device features and driver 
>>>> features, there just need a patch for iproute2
>>>> 3) To process information like MQ, we don't just dump the config 
>>>> space, MST has explained before
>>> So, it's for live migration... Then why not export those config 
>>> parameters specified for vdpa creation (as well as device feature 
>>> bits) to the output of "vdpa dev show" command? That's where device 
>>> side config lives and is static across vdpa's life cycle. "vdpa dev 
>>> config show" is mostly for dynamic driver side config, and the 
>>> validity is subject to feature negotiation. I suppose this should 
>>> suit your need of LM, e.g.
>>
>>
>> I think so.
>>
>>
>>>
>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
>>> $ vdpa dev show vdpa1
>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs 
>>> 15 max_vq_size 256
>>>   max_vqp 7 mtu 2000
>>>   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS 
>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>
>>
>> Note that the mgmt should know this destination have those 
>> capability/features before the provisioning.
> Yes, mgmt software should have to check the above from source.

On destination mgmt software can run below to check vdpa mgmtdev's 
capability/features:

$ vdpa mgmtdev show pci/0000:41:04.3
pci/0000:41:04.3:
   supported_classes net
   max_supported_vqs 257
   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ 
MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>
>>
>>
>>>
>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to 
>>> _vdpa_register_device() during registration, and get it saved there 
>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field 
>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>
>>> Thanks,
>>> -Siwei
>>
>>
>> Thanks
>>
>>
>>>>
>>>> Thanks
>>>> Zhu Lingshan
>>>>>
>>>>>>>> Nope:
>>>>>>>>
>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>> For optional configuration space fields, the driver MUST check 
>>>>>>>> that the corresponding feature is offered
>>>>>>>> before accessing that part of the configuration space.
>>>>>>>
>>>>>>> and how many driver bugs taking wrong assumption of the validity 
>>>>>>> of config space field without features_ok. I am not sure what 
>>>>>>> use case you want to expose config resister values for before 
>>>>>>> features_ok, if it's mostly for live migration I guess it's 
>>>>>>> probably heading a wrong direction.
>>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Last but not the least, this "vdpa dev config" command was not 
>>>>>>>>> designed to display the real config space register values in 
>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>
>>>>>>>>>> vdpa dev config show - Show configuration of specific device 
>>>>>>>>>> or all devices.
>>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If 
>>>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>>>> It doesn't say anything about configuration space or register 
>>>>>>>>> values in config space. As long as it can convey the config 
>>>>>>>>> attribute when instantiating vDPA device instance, and more 
>>>>>>>>> importantly, the config can be easily imported from or 
>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa 
>>>>>>>>> instance intact on destination host for live migration, IMHO 
>>>>>>>>> in my personal interpretation it doesn't matter what the 
>>>>>>>>> config space may present. It may be worth while adding a new 
>>>>>>>>> debug command to expose the real register value, but that's 
>>>>>>>>> another story.
>>>>>>>> I am not sure getting your points. vDPA now reports device 
>>>>>>>> feature bits(device_features) and negotiated feature 
>>>>>>>> bits(driver_features), and yes, the drivers features can be a 
>>>>>>>> subset of the device features; and the vDPA device features can 
>>>>>>>> be a subset of the management device features.
>>>>>>> What I said is after unblocking the conditional check, you'd 
>>>>>>> have to handle the case for each of the vdpa attribute when 
>>>>>>> feature negotiation is not yet done: basically the register 
>>>>>>> values you got from config space via the 
>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid 
>>>>>>> before features_ok (per-spec). Although in some case you may get 
>>>>>>> sane value, such behavior is generally undefined. If you desire 
>>>>>>> to show just the device_features alone without any config space 
>>>>>>> field, which the device had advertised *before feature 
>>>>>>> negotiation is complete*, that'll be fine. But looks to me this 
>>>>>>> is not how patch has been implemented. Probably need some more 
>>>>>>> work?
>>>>>> They are driver_features(negotiated) and the 
>>>>>> device_features(which comes with the device), and the config 
>>>>>> space fields that depend on them. In this series, we report both 
>>>>>> to the userspace.
>>>>> I fail to understand what you want to present from your 
>>>>> description. May be worth showing some example outputs that at 
>>>>> least include the following cases: 1) when device offers features 
>>>>> but not yet acknowledge by guest 2) when guest acknowledged 
>>>>> features and device is yet to accept 3) after guest feature 
>>>>> negotiation is completed (agreed upon between guest and device).
>>>> Only two feature sets: 1) what the device has. (2) what is negotiated
>>>>>
>>>>> Thanks,
>>>>> -Siwei
>>>>>>>
>>>>>>> Regards,
>>>>>>> -Siwei
>>>>>>>
>>>>>>>>>
>>>>>>>>> Having said, please consider to drop the Fixes tag, as appears 
>>>>>>>>> to me you're proposing a new feature rather than fixing a real 
>>>>>>>>> issue.
>>>>>>>> it's a new feature to report the device feature bits than only 
>>>>>>>> negotiated features, however this patch is a must, or it will 
>>>>>>>> block the device feature bits reporting. but I agree, the fix 
>>>>>>>> tag is not a must.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -Siwei
>>>>>>>>>
>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>
>>>>>>>>>>> Users may want to query the config space of a vDPA device, 
>>>>>>>>>>> to choose a
>>>>>>>>>>> appropriate one for a certain guest. This means the users 
>>>>>>>>>>> need to read the
>>>>>>>>>>> config space before FEATURES_OK, and the existence of config 
>>>>>>>>>>> space
>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>
>>>>>>>>>>> The spec says:
>>>>>>>>>>> The device MUST allow reading of any device-specific 
>>>>>>>>>>> configuration field
>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes 
>>>>>>>>>>> fields which are
>>>>>>>>>>> conditional on feature bits, as long as those feature bits 
>>>>>>>>>>> are offered by the
>>>>>>>>>>> device.
>>>>>>>>>>>
>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only 
>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>>
>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>> And
>>>>>>>>>> It should be in format
>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if 
>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>
>>>>>>>>>> Please use checkpatch.pl script before posting the patches to 
>>>>>>>>>> catch these errors.
>>>>>>>>>> There is a bot that looks at the fixes tag and identifies the 
>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>   1 file changed, 8 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device 
>>>>>>>>>>> *vdev,
>>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>>       u32 device_id;
>>>>>>>>>>>       void *hdr;
>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>       int err;
>>>>>>>>>>>
>>>>>>>>>>>       down_read(&vdev->cf_lock);
>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>>>> completed");
>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>> -        goto out;
>>>>>>>>>>> -    }
>>>>>>>>>>> -
>>>>>>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, 
>>>>>>>>>>> flags,
>>>>>>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>       if (!hdr) {
>>>>>>>>>>> -- 
>>>>>>>>>>> 2.31.1
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Virtualization mailing list
>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
Jason Wang Aug. 2, 2022, 6:33 a.m. UTC | #16
On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
> >
> >
> > On 7/31/2022 9:44 PM, Jason Wang wrote:
> >>
> >> 在 2022/7/30 04:55, Si-Wei Liu 写道:
> >>>
> >>>
> >>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
> >>>>
> >>>>
> >>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
> >>>>>
> >>>>>
> >>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
> >>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't
> >>>>>>>>> get to most emails for this discussion (I only subscribed to
> >>>>>>>>> the virtualization list), while I was taking off amongst the
> >>>>>>>>> past few weeks.
> >>>>>>>>>
> >>>>>>>>> It looks to me this patch is incomplete. Noted down the way in
> >>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
> >>>>>>>>>          features = vdev->config->get_driver_features(vdev);
> >>>>>>>>>          if (nla_put_u64_64bit(msg,
> >>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
> >>>>>>>>>                                VDPA_ATTR_PAD))
> >>>>>>>>>                  return -EMSGSIZE;
> >>>>>>>>>
> >>>>>>>>> Making call to .get_driver_features() doesn't make sense when
> >>>>>>>>> feature negotiation isn't complete. Neither should present
> >>>>>>>>> negotiated_features to userspace before negotiation is done.
> >>>>>>>>>
> >>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
> >>>>>>>>> probably should not show before negotiation is done - it
> >>>>>>>>> depends on driver features negotiated.
> >>>>>>>> I have another patch in this series introduces device_features
> >>>>>>>> and will report device_features to the userspace even features
> >>>>>>>> negotiation not done. Because the spec says we should allow
> >>>>>>>> driver access the config space before FEATURES_OK.
> >>>>>>> The config space can be accessed by guest before features_ok
> >>>>>>> doesn't necessarily mean the value is valid. You may want to
> >>>>>>> double check with Michael for what he quoted earlier:
> >>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
> >>>>>> vDPA kernel should not return a mac to the userspace, there is
> >>>>>> not a default value for mac.
> >>>>> Then please show us the code, as I can only comment based on your
> >>>>> latest (v4) patch and it was not there.. To be honest, I don't
> >>>>> understand the motivation and the use cases you have, is it for
> >>>>> debugging/monitoring or there's really a use case for live
> >>>>> migration? For the former, you can do a direct dump on all config
> >>>>> space fields regardless of endianess and feature negotiation
> >>>>> without having to worry about validity (meaningful to present to
> >>>>> admin user). To me these are conflict asks that is impossible to
> >>>>> mix in exact one command.
> >>>> This bug just has been revealed two days, and you will see the
> >>>> patch soon.
> >>>>
> >>>> There are something to clarify:
> >>>> 1) we need to read the device features, or how can you pick a
> >>>> proper LM destination
> >>
> >>
> >> So it's probably not very efficient to use this, the manager layer
> >> should have the knowledge about the compatibility before doing
> >> migration other than try-and-fail.
> >>
> >> And it's the task of the management to gather the nodes whose devices
> >> could be live migrated to each other as something like "cluster"
> >> which we've already used in the case of cpuflags.
> >>
> >> 1) during node bootstrap, the capability of each node and devices was
> >> reported to management layer
> >> 2) management layer decide the cluster and make sure the migration
> >> can only done among the nodes insides the cluster
> >> 3) before migration, the vDPA needs to be provisioned on the destination
> >>
> >>
> >>>> 2) vdpa dev config show can show both device features and driver
> >>>> features, there just need a patch for iproute2
> >>>> 3) To process information like MQ, we don't just dump the config
> >>>> space, MST has explained before
> >>> So, it's for live migration... Then why not export those config
> >>> parameters specified for vdpa creation (as well as device feature
> >>> bits) to the output of "vdpa dev show" command? That's where device
> >>> side config lives and is static across vdpa's life cycle. "vdpa dev
> >>> config show" is mostly for dynamic driver side config, and the
> >>> validity is subject to feature negotiation. I suppose this should
> >>> suit your need of LM, e.g.
> >>
> >>
> >> I think so.
> >>
> >>
> >>>
> >>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
> >>> $ vdpa dev show vdpa1
> >>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs
> >>> 15 max_vq_size 256
> >>>   max_vqp 7 mtu 2000
> >>>   dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
> >>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
> >>
> >>
> >> Note that the mgmt should know this destination have those
> >> capability/features before the provisioning.
> > Yes, mgmt software should have to check the above from source.
>
> On destination mgmt software can run below to check vdpa mgmtdev's
> capability/features:
>
> $ vdpa mgmtdev show pci/0000:41:04.3
> pci/0000:41:04.3:
>    supported_classes net
>    max_supported_vqs 257
>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ
> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED

Right and this is probably better to be done at node bootstrapping for
the management to know about the cluster.

Thanks

> >
> >>
> >>
> >>>
> >>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
> >>> _vdpa_register_device() during registration, and get it saved there
> >>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
> >>> conditionally subject to "struct vdpa_dev_set_config.mask".
> >>>
> >>> Thanks,
> >>> -Siwei
> >>
> >>
> >> Thanks
> >>
> >>
> >>>>
> >>>> Thanks
> >>>> Zhu Lingshan
> >>>>>
> >>>>>>>> Nope:
> >>>>>>>>
> >>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
> >>>>>>>>
> >>>>>>>> ...
> >>>>>>>>
> >>>>>>>> For optional configuration space fields, the driver MUST check
> >>>>>>>> that the corresponding feature is offered
> >>>>>>>> before accessing that part of the configuration space.
> >>>>>>>
> >>>>>>> and how many driver bugs taking wrong assumption of the validity
> >>>>>>> of config space field without features_ok. I am not sure what
> >>>>>>> use case you want to expose config resister values for before
> >>>>>>> features_ok, if it's mostly for live migration I guess it's
> >>>>>>> probably heading a wrong direction.
> >>>>>>>
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Last but not the least, this "vdpa dev config" command was not
> >>>>>>>>> designed to display the real config space register values in
> >>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
> >>>>>>>>>
> >>>>>>>>>> vdpa dev config show - Show configuration of specific device
> >>>>>>>>>> or all devices.
> >>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If
> >>>>>>>>>> this argument is omitted all devices configuration is listed.
> >>>>>>>>> It doesn't say anything about configuration space or register
> >>>>>>>>> values in config space. As long as it can convey the config
> >>>>>>>>> attribute when instantiating vDPA device instance, and more
> >>>>>>>>> importantly, the config can be easily imported from or
> >>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
> >>>>>>>>> instance intact on destination host for live migration, IMHO
> >>>>>>>>> in my personal interpretation it doesn't matter what the
> >>>>>>>>> config space may present. It may be worth while adding a new
> >>>>>>>>> debug command to expose the real register value, but that's
> >>>>>>>>> another story.
> >>>>>>>> I am not sure getting your points. vDPA now reports device
> >>>>>>>> feature bits(device_features) and negotiated feature
> >>>>>>>> bits(driver_features), and yes, the drivers features can be a
> >>>>>>>> subset of the device features; and the vDPA device features can
> >>>>>>>> be a subset of the management device features.
> >>>>>>> What I said is after unblocking the conditional check, you'd
> >>>>>>> have to handle the case for each of the vdpa attribute when
> >>>>>>> feature negotiation is not yet done: basically the register
> >>>>>>> values you got from config space via the
> >>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
> >>>>>>> before features_ok (per-spec). Although in some case you may get
> >>>>>>> sane value, such behavior is generally undefined. If you desire
> >>>>>>> to show just the device_features alone without any config space
> >>>>>>> field, which the device had advertised *before feature
> >>>>>>> negotiation is complete*, that'll be fine. But looks to me this
> >>>>>>> is not how patch has been implemented. Probably need some more
> >>>>>>> work?
> >>>>>> They are driver_features(negotiated) and the
> >>>>>> device_features(which comes with the device), and the config
> >>>>>> space fields that depend on them. In this series, we report both
> >>>>>> to the userspace.
> >>>>> I fail to understand what you want to present from your
> >>>>> description. May be worth showing some example outputs that at
> >>>>> least include the following cases: 1) when device offers features
> >>>>> but not yet acknowledge by guest 2) when guest acknowledged
> >>>>> features and device is yet to accept 3) after guest feature
> >>>>> negotiation is completed (agreed upon between guest and device).
> >>>> Only two feature sets: 1) what the device has. (2) what is negotiated
> >>>>>
> >>>>> Thanks,
> >>>>> -Siwei
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> -Siwei
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>> Having said, please consider to drop the Fixes tag, as appears
> >>>>>>>>> to me you're proposing a new feature rather than fixing a real
> >>>>>>>>> issue.
> >>>>>>>> it's a new feature to report the device feature bits than only
> >>>>>>>> negotiated features, however this patch is a must, or it will
> >>>>>>>> block the device feature bits reporting. but I agree, the fix
> >>>>>>>> tag is not a must.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> -Siwei
> >>>>>>>>>
> >>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
> >>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
> >>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
> >>>>>>>>>>>
> >>>>>>>>>>> Users may want to query the config space of a vDPA device,
> >>>>>>>>>>> to choose a
> >>>>>>>>>>> appropriate one for a certain guest. This means the users
> >>>>>>>>>>> need to read the
> >>>>>>>>>>> config space before FEATURES_OK, and the existence of config
> >>>>>>>>>>> space
> >>>>>>>>>>> contents does not depend on FEATURES_OK.
> >>>>>>>>>>>
> >>>>>>>>>>> The spec says:
> >>>>>>>>>>> The device MUST allow reading of any device-specific
> >>>>>>>>>>> configuration field
> >>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
> >>>>>>>>>>> fields which are
> >>>>>>>>>>> conditional on feature bits, as long as those feature bits
> >>>>>>>>>>> are offered by the
> >>>>>>>>>>> device.
> >>>>>>>>>>>
> >>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
> >>>>>>>>>>> if FEATURES_OK)
> >>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
> >>>>>>>>>>
> >>>>>>>>>> Above commit id is 13 letters should be 12.
> >>>>>>>>>> And
> >>>>>>>>>> It should be in format
> >>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
> >>>>>>>>>> FEATURES_OK")
> >>>>>>>>>>
> >>>>>>>>>> Please use checkpatch.pl script before posting the patches to
> >>>>>>>>>> catch these errors.
> >>>>>>>>>> There is a bot that looks at the fixes tag and identifies the
> >>>>>>>>>> right kernel version to apply this fix.
> >>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
> >>>>>>>>>>> ---
> >>>>>>>>>>>   drivers/vdpa/vdpa.c | 8 --------
> >>>>>>>>>>>   1 file changed, 8 deletions(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
> >>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
> >>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
> >>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
> >>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device
> >>>>>>>>>>> *vdev,
> >>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
> >>>>>>>>>>>       u32 device_id;
> >>>>>>>>>>>       void *hdr;
> >>>>>>>>>>> -    u8 status;
> >>>>>>>>>>>       int err;
> >>>>>>>>>>>
> >>>>>>>>>>>       down_read(&vdev->cf_lock);
> >>>>>>>>>>> -    status = vdev->config->get_status(vdev);
> >>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
> >>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
> >>>>>>>>>>> completed");
> >>>>>>>>>>> -        err = -EAGAIN;
> >>>>>>>>>>> -        goto out;
> >>>>>>>>>>> -    }
> >>>>>>>>>>> -
> >>>>>>>>>>>       hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
> >>>>>>>>>>> flags,
> >>>>>>>>>>>                 VDPA_CMD_DEV_CONFIG_GET);
> >>>>>>>>>>>       if (!hdr) {
> >>>>>>>>>>> --
> >>>>>>>>>>> 2.31.1
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Virtualization mailing list
> >>>>>>>>>> Virtualization@lists.linux-foundation.org
> >>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
Si-Wei Liu Aug. 3, 2022, 1:26 a.m. UTC | #17
On 8/1/2022 11:33 PM, Jason Wang wrote:
> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>
>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>> 在 2022/7/30 04:55, Si-Wei Liu 写道:
>>>>>
>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>
>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>
>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>
>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>
>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't
>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>> past few weeks.
>>>>>>>>>>>
>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the way in
>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>                                 VDPA_ATTR_PAD))
>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>
>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense when
>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>
>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>> I have another patch in this series introduces device_features
>>>>>>>>>> and will report device_features to the userspace even features
>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>> not a default value for mac.
>>>>>>> Then please show us the code, as I can only comment based on your
>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>> migration? For the former, you can do a direct dump on all config
>>>>>>> space fields regardless of endianess and feature negotiation
>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>> mix in exact one command.
>>>>>> This bug just has been revealed two days, and you will see the
>>>>>> patch soon.
>>>>>>
>>>>>> There are something to clarify:
>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>> proper LM destination
>>>>
>>>> So it's probably not very efficient to use this, the manager layer
>>>> should have the knowledge about the compatibility before doing
>>>> migration other than try-and-fail.
>>>>
>>>> And it's the task of the management to gather the nodes whose devices
>>>> could be live migrated to each other as something like "cluster"
>>>> which we've already used in the case of cpuflags.
>>>>
>>>> 1) during node bootstrap, the capability of each node and devices was
>>>> reported to management layer
>>>> 2) management layer decide the cluster and make sure the migration
>>>> can only done among the nodes insides the cluster
>>>> 3) before migration, the vDPA needs to be provisioned on the destination
>>>>
>>>>
>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>> features, there just need a patch for iproute2
>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>> space, MST has explained before
>>>>> So, it's for live migration... Then why not export those config
>>>>> parameters specified for vdpa creation (as well as device feature
>>>>> bits) to the output of "vdpa dev show" command? That's where device
>>>>> side config lives and is static across vdpa's life cycle. "vdpa dev
>>>>> config show" is mostly for dynamic driver side config, and the
>>>>> validity is subject to feature negotiation. I suppose this should
>>>>> suit your need of LM, e.g.
>>>>
>>>> I think so.
>>>>
>>>>
>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 2000
>>>>> $ vdpa dev show vdpa1
>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs
>>>>> 15 max_vq_size 256
>>>>>    max_vqp 7 mtu 2000
>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>
>>>> Note that the mgmt should know this destination have those
>>>> capability/features before the provisioning.
>>> Yes, mgmt software should have to check the above from source.
>> On destination mgmt software can run below to check vdpa mgmtdev's
>> capability/features:
>>
>> $ vdpa mgmtdev show pci/0000:41:04.3
>> pci/0000:41:04.3:
>>     supported_classes net
>>     max_supported_vqs 257
>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ
>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
> Right and this is probably better to be done at node bootstrapping for
> the management to know about the cluster.
Exactly. That's what mgmt software is supposed to do typically.

Thanks,
-Siwei

>
> Thanks
>
>>>>
>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>> _vdpa_register_device() during registration, and get it saved there
>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>
>>>>> Thanks,
>>>>> -Siwei
>>>>
>>>> Thanks
>>>>
>>>>
>>>>>> Thanks
>>>>>> Zhu Lingshan
>>>>>>>>>> Nope:
>>>>>>>>>>
>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> For optional configuration space fields, the driver MUST check
>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>> and how many driver bugs taking wrong assumption of the validity
>>>>>>>>> of config space field without features_ok. I am not sure what
>>>>>>>>> use case you want to expose config resister values for before
>>>>>>>>> features_ok, if it's mostly for live migration I guess it's
>>>>>>>>> probably heading a wrong direction.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Last but not the least, this "vdpa dev config" command was not
>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>
>>>>>>>>>>>> vdpa dev config show - Show configuration of specific device
>>>>>>>>>>>> or all devices.
>>>>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If
>>>>>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>>>>>> It doesn't say anything about configuration space or register
>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>> another story.
>>>>>>>>>> I am not sure getting your points. vDPA now reports device
>>>>>>>>>> feature bits(device_features) and negotiated feature
>>>>>>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>>>>>>> subset of the device features; and the vDPA device features can
>>>>>>>>>> be a subset of the management device features.
>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>> values you got from config space via the
>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>> before features_ok (per-spec). Although in some case you may get
>>>>>>>>> sane value, such behavior is generally undefined. If you desire
>>>>>>>>> to show just the device_features alone without any config space
>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me this
>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>> work?
>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>> space fields that depend on them. In this series, we report both
>>>>>>>> to the userspace.
>>>>>>> I fail to understand what you want to present from your
>>>>>>> description. May be worth showing some example outputs that at
>>>>>>> least include the following cases: 1) when device offers features
>>>>>>> but not yet acknowledge by guest 2) when guest acknowledged
>>>>>>> features and device is yet to accept 3) after guest feature
>>>>>>> negotiation is completed (agreed upon between guest and device).
>>>>>> Only two feature sets: 1) what the device has. (2) what is negotiated
>>>>>>> Thanks,
>>>>>>> -Siwei
>>>>>>>>> Regards,
>>>>>>>>> -Siwei
>>>>>>>>>
>>>>>>>>>>> Having said, please consider to drop the Fixes tag, as appears
>>>>>>>>>>> to me you're proposing a new feature rather than fixing a real
>>>>>>>>>>> issue.
>>>>>>>>>> it's a new feature to report the device feature bits than only
>>>>>>>>>> negotiated features, however this patch is a must, or it will
>>>>>>>>>> block the device feature bits reporting. but I agree, the fix
>>>>>>>>>> tag is not a must.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -Siwei
>>>>>>>>>>>
>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>
>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>> to choose a
>>>>>>>>>>>>> appropriate one for a certain guest. This means the users
>>>>>>>>>>>>> need to read the
>>>>>>>>>>>>> config space before FEATURES_OK, and the existence of config
>>>>>>>>>>>>> space
>>>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>> configuration field
>>>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
>>>>>>>>>>>>> fields which are
>>>>>>>>>>>>> conditional on feature bits, as long as those feature bits
>>>>>>>>>>>>> are offered by the
>>>>>>>>>>>>> device.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>>>>
>>>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>>>> And
>>>>>>>>>>>> It should be in format
>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
>>>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>>>
>>>>>>>>>>>> Please use checkpatch.pl script before posting the patches to
>>>>>>>>>>>> catch these errors.
>>>>>>>>>>>> There is a bot that looks at the fixes tag and identifies the
>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device
>>>>>>>>>>>>> *vdev,
>>>>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>>>>        u32 device_id;
>>>>>>>>>>>>>        void *hdr;
>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>        int err;
>>>>>>>>>>>>>
>>>>>>>>>>>>>        down_read(&vdev->cf_lock);
>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>>>>>> completed");
>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>> -    }
>>>>>>>>>>>>> -
>>>>>>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>>>>>>>>>>>>> flags,
>>>>>>>>>>>>>                  VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>        if (!hdr) {
>>>>>>>>>>>>> --
>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$
>>>>>>>>>>>
>>>>>>>>>>>
Zhu, Lingshan Aug. 3, 2022, 2:30 a.m. UTC | #18
On 8/3/2022 9:26 AM, Si-Wei Liu wrote:
>
>
> On 8/1/2022 11:33 PM, Jason Wang wrote:
>> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>>
>>>
>>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>>
>>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>>> 在 2022/7/30 04:55, Si-Wei Liu 写道:
>>>>>>
>>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>>
>>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>>
>>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>>
>>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>>> Sorry to chime in late in the game. For some reason I couldn't
>>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>>> past few weeks.
>>>>>>>>>>>>
>>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the way in
>>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>> VDPA_ATTR_PAD))
>>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>>
>>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense when
>>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>>
>>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>>> I have another patch in this series introduces device_features
>>>>>>>>>>> and will report device_features to the userspace even features
>>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>>> not a default value for mac.
>>>>>>>> Then please show us the code, as I can only comment based on your
>>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>>> migration? For the former, you can do a direct dump on all config
>>>>>>>> space fields regardless of endianess and feature negotiation
>>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>>> mix in exact one command.
>>>>>>> This bug just has been revealed two days, and you will see the
>>>>>>> patch soon.
>>>>>>>
>>>>>>> There are something to clarify:
>>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>>> proper LM destination
>>>>>
>>>>> So it's probably not very efficient to use this, the manager layer
>>>>> should have the knowledge about the compatibility before doing
>>>>> migration other than try-and-fail.
>>>>>
>>>>> And it's the task of the management to gather the nodes whose devices
>>>>> could be live migrated to each other as something like "cluster"
>>>>> which we've already used in the case of cpuflags.
>>>>>
>>>>> 1) during node bootstrap, the capability of each node and devices was
>>>>> reported to management layer
>>>>> 2) management layer decide the cluster and make sure the migration
>>>>> can only done among the nodes insides the cluster
>>>>> 3) before migration, the vDPA needs to be provisioned on the 
>>>>> destination
>>>>>
>>>>>
>>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>>> features, there just need a patch for iproute2
>>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>>> space, MST has explained before
>>>>>> So, it's for live migration... Then why not export those config
>>>>>> parameters specified for vdpa creation (as well as device feature
>>>>>> bits) to the output of "vdpa dev show" command? That's where device
>>>>>> side config lives and is static across vdpa's life cycle. "vdpa dev
>>>>>> config show" is mostly for dynamic driver side config, and the
>>>>>> validity is subject to feature negotiation. I suppose this should
>>>>>> suit your need of LM, e.g.
>>>>>
>>>>> I think so.
>>>>>
>>>>>
>>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 
>>>>>> 2000
>>>>>> $ vdpa dev show vdpa1
>>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs
>>>>>> 15 max_vq_size 256
>>>>>>    max_vqp 7 mtu 2000
>>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>>
>>>>> Note that the mgmt should know this destination have those
>>>>> capability/features before the provisioning.
>>>> Yes, mgmt software should have to check the above from source.
>>> On destination mgmt software can run below to check vdpa mgmtdev's
>>> capability/features:
>>>
>>> $ vdpa mgmtdev show pci/0000:41:04.3
>>> pci/0000:41:04.3:
>>>     supported_classes net
>>>     max_supported_vqs 257
>>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS CTRL_VQ
>>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>> Right and this is probably better to be done at node bootstrapping for
>> the management to know about the cluster.
> Exactly. That's what mgmt software is supposed to do typically.
I think this could apply to both mgmt devices and vDPA devices:
1)mgmt device, see whether the mgmt device is capable to create a vDPA 
device with a certain feature bits, this is for LM
2)vDPA device, report the device features, it is for normal operation

Thanks,
Zhu Lingshan
>
> Thanks,
> -Siwei
>
>>
>> Thanks
>>
>>>>>
>>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>>> _vdpa_register_device() during registration, and get it saved there
>>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>>
>>>>>> Thanks,
>>>>>> -Siwei
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>>>> Thanks
>>>>>>> Zhu Lingshan
>>>>>>>>>>> Nope:
>>>>>>>>>>>
>>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>>
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> For optional configuration space fields, the driver MUST check
>>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>>> and how many driver bugs taking wrong assumption of the validity
>>>>>>>>>> of config space field without features_ok. I am not sure what
>>>>>>>>>> use case you want to expose config resister values for before
>>>>>>>>>> features_ok, if it's mostly for live migration I guess it's
>>>>>>>>>> probably heading a wrong direction.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Last but not the least, this "vdpa dev config" command was not
>>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>>
>>>>>>>>>>>>> vdpa dev config show - Show configuration of specific device
>>>>>>>>>>>>> or all devices.
>>>>>>>>>>>>> DEV - specifies the vdpa device to show its configuration. If
>>>>>>>>>>>>> this argument is omitted all devices configuration is listed.
>>>>>>>>>>>> It doesn't say anything about configuration space or register
>>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>>> another story.
>>>>>>>>>>> I am not sure getting your points. vDPA now reports device
>>>>>>>>>>> feature bits(device_features) and negotiated feature
>>>>>>>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>>>>>>>> subset of the device features; and the vDPA device features can
>>>>>>>>>>> be a subset of the management device features.
>>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>>> values you got from config space via the
>>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>>> before features_ok (per-spec). Although in some case you may get
>>>>>>>>>> sane value, such behavior is generally undefined. If you desire
>>>>>>>>>> to show just the device_features alone without any config space
>>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me this
>>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>>> work?
>>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>>> space fields that depend on them. In this series, we report both
>>>>>>>>> to the userspace.
>>>>>>>> I fail to understand what you want to present from your
>>>>>>>> description. May be worth showing some example outputs that at
>>>>>>>> least include the following cases: 1) when device offers features
>>>>>>>> but not yet acknowledge by guest 2) when guest acknowledged
>>>>>>>> features and device is yet to accept 3) after guest feature
>>>>>>>> negotiation is completed (agreed upon between guest and device).
>>>>>>> Only two feature sets: 1) what the device has. (2) what is 
>>>>>>> negotiated
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>>>> Regards,
>>>>>>>>>> -Siwei
>>>>>>>>>>
>>>>>>>>>>>> Having said, please consider to drop the Fixes tag, as appears
>>>>>>>>>>>> to me you're proposing a new feature rather than fixing a real
>>>>>>>>>>>> issue.
>>>>>>>>>>> it's a new feature to report the device feature bits than only
>>>>>>>>>>> negotiated features, however this patch is a must, or it will
>>>>>>>>>>> block the device feature bits reporting. but I agree, the fix
>>>>>>>>>>> tag is not a must.
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>>> to choose a
>>>>>>>>>>>>>> appropriate one for a certain guest. This means the users
>>>>>>>>>>>>>> need to read the
>>>>>>>>>>>>>> config space before FEATURES_OK, and the existence of config
>>>>>>>>>>>>>> space
>>>>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>>> configuration field
>>>>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
>>>>>>>>>>>>>> fields which are
>>>>>>>>>>>>>> conditional on feature bits, as long as those feature bits
>>>>>>>>>>>>>> are offered by the
>>>>>>>>>>>>>> device.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>>>>> And
>>>>>>>>>>>>> It should be in format
>>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration only if
>>>>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please use checkpatch.pl script before posting the patches to
>>>>>>>>>>>>> catch these errors.
>>>>>>>>>>>>> There is a bot that looks at the fixes tag and identifies the
>>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index
>>>>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct vdpa_device
>>>>>>>>>>>>>> *vdev,
>>>>>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>>>>>        u32 device_id;
>>>>>>>>>>>>>>        void *hdr;
>>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>>        int err;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> down_read(&vdev->cf_lock);
>>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features negotiation 
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>> completed");
>>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>>> -    }
>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>>>>>>>>>>>>>> flags,
>>>>>>>>>>>>>> VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>>        if (!hdr) {
>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$ 
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>
Si-Wei Liu Aug. 3, 2022, 11:09 p.m. UTC | #19
On 8/2/2022 7:30 PM, Zhu, Lingshan wrote:
>
>
> On 8/3/2022 9:26 AM, Si-Wei Liu wrote:
>>
>>
>> On 8/1/2022 11:33 PM, Jason Wang wrote:
>>> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> 
>>> wrote:
>>>>
>>>>
>>>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>>>
>>>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>>>> 在 2022/7/30 04:55, Si-Wei Liu 写道:
>>>>>>>
>>>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>>>
>>>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>>>
>>>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>>>> Sorry to chime in late in the game. For some reason I 
>>>>>>>>>>>>> couldn't
>>>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>>>> past few weeks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the 
>>>>>>>>>>>>> way in
>>>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>>>           features = vdev->config->get_driver_features(vdev);
>>>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>>> VDPA_ATTR_PAD))
>>>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>>>
>>>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense when
>>>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>>>> I have another patch in this series introduces device_features
>>>>>>>>>>>> and will report device_features to the userspace even features
>>>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>>>> not a default value for mac.
>>>>>>>>> Then please show us the code, as I can only comment based on your
>>>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>>>> migration? For the former, you can do a direct dump on all config
>>>>>>>>> space fields regardless of endianess and feature negotiation
>>>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>>>> mix in exact one command.
>>>>>>>> This bug just has been revealed two days, and you will see the
>>>>>>>> patch soon.
>>>>>>>>
>>>>>>>> There are something to clarify:
>>>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>>>> proper LM destination
>>>>>>
>>>>>> So it's probably not very efficient to use this, the manager layer
>>>>>> should have the knowledge about the compatibility before doing
>>>>>> migration other than try-and-fail.
>>>>>>
>>>>>> And it's the task of the management to gather the nodes whose 
>>>>>> devices
>>>>>> could be live migrated to each other as something like "cluster"
>>>>>> which we've already used in the case of cpuflags.
>>>>>>
>>>>>> 1) during node bootstrap, the capability of each node and devices 
>>>>>> was
>>>>>> reported to management layer
>>>>>> 2) management layer decide the cluster and make sure the migration
>>>>>> can only done among the nodes insides the cluster
>>>>>> 3) before migration, the vDPA needs to be provisioned on the 
>>>>>> destination
>>>>>>
>>>>>>
>>>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>>>> features, there just need a patch for iproute2
>>>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>>>> space, MST has explained before
>>>>>>> So, it's for live migration... Then why not export those config
>>>>>>> parameters specified for vdpa creation (as well as device feature
>>>>>>> bits) to the output of "vdpa dev show" command? That's where device
>>>>>>> side config lives and is static across vdpa's life cycle. "vdpa dev
>>>>>>> config show" is mostly for dynamic driver side config, and the
>>>>>>> validity is subject to feature negotiation. I suppose this should
>>>>>>> suit your need of LM, e.g.
>>>>>>
>>>>>> I think so.
>>>>>>
>>>>>>
>>>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 mtu 
>>>>>>> 2000
>>>>>>> $ vdpa dev show vdpa1
>>>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 max_vqs
>>>>>>> 15 max_vq_size 256
>>>>>>>    max_vqp 7 mtu 2000
>>>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>>>
>>>>>> Note that the mgmt should know this destination have those
>>>>>> capability/features before the provisioning.
>>>>> Yes, mgmt software should have to check the above from source.
>>>> On destination mgmt software can run below to check vdpa mgmtdev's
>>>> capability/features:
>>>>
>>>> $ vdpa mgmtdev show pci/0000:41:04.3
>>>> pci/0000:41:04.3:
>>>>     supported_classes net
>>>>     max_supported_vqs 257
>>>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS 
>>>> CTRL_VQ
>>>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>> Right and this is probably better to be done at node bootstrapping for
>>> the management to know about the cluster.
>> Exactly. That's what mgmt software is supposed to do typically.
> I think this could apply to both mgmt devices and vDPA devices:
> 1)mgmt device, see whether the mgmt device is capable to create a vDPA 
> device with a certain feature bits, this is for LM
> 2)vDPA device, report the device features, it is for normal operation
Can you elaborate the use case "for normal operations"? Then it has 
nothing to do with LM for sure, correct?

Noted for the LM case, just as Jason indicated, it's not even *required* 
for the mgmt software to gather the device features through "vdpa dev 
show" on source host *alive* right before live migration is started. 
Depending on the way how it is implemented, the mgmt software can well 
collect device capability on boot strap time, or may well save the vdpa 
device capability/config in persistent store ahead of time, say before 
any VM is to be launched. Then with all such info collected for each 
cluster node, mgmt software is able to get its own way to infer and sort 
out the live migration compatibility between nodes. I'm not sure which 
case you would need to check the device features, but in case you need 
it, it'd be better live in "vdpa dev show" than "vdpa dev config show".

Thanks,
-Siwei

>
> Thanks,
> Zhu Lingshan
>>
>> Thanks,
>> -Siwei
>>
>>>
>>> Thanks
>>>
>>>>>>
>>>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>>>> _vdpa_register_device() during registration, and get it saved there
>>>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Siwei
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>>> Thanks
>>>>>>>> Zhu Lingshan
>>>>>>>>>>>> Nope:
>>>>>>>>>>>>
>>>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>>>
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> For optional configuration space fields, the driver MUST check
>>>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>>>> and how many driver bugs taking wrong assumption of the 
>>>>>>>>>>> validity
>>>>>>>>>>> of config space field without features_ok. I am not sure what
>>>>>>>>>>> use case you want to expose config resister values for before
>>>>>>>>>>> features_ok, if it's mostly for live migration I guess it's
>>>>>>>>>>> probably heading a wrong direction.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Last but not the least, this "vdpa dev config" command was 
>>>>>>>>>>>>> not
>>>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> vdpa dev config show - Show configuration of specific device
>>>>>>>>>>>>>> or all devices.
>>>>>>>>>>>>>> DEV - specifies the vdpa device to show its 
>>>>>>>>>>>>>> configuration. If
>>>>>>>>>>>>>> this argument is omitted all devices configuration is 
>>>>>>>>>>>>>> listed.
>>>>>>>>>>>>> It doesn't say anything about configuration space or register
>>>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>>>> another story.
>>>>>>>>>>>> I am not sure getting your points. vDPA now reports device
>>>>>>>>>>>> feature bits(device_features) and negotiated feature
>>>>>>>>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>>>>>>>>> subset of the device features; and the vDPA device features 
>>>>>>>>>>>> can
>>>>>>>>>>>> be a subset of the management device features.
>>>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>>>> values you got from config space via the
>>>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>>>> before features_ok (per-spec). Although in some case you may 
>>>>>>>>>>> get
>>>>>>>>>>> sane value, such behavior is generally undefined. If you desire
>>>>>>>>>>> to show just the device_features alone without any config space
>>>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me this
>>>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>>>> work?
>>>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>>>> space fields that depend on them. In this series, we report both
>>>>>>>>>> to the userspace.
>>>>>>>>> I fail to understand what you want to present from your
>>>>>>>>> description. May be worth showing some example outputs that at
>>>>>>>>> least include the following cases: 1) when device offers features
>>>>>>>>> but not yet acknowledge by guest 2) when guest acknowledged
>>>>>>>>> features and device is yet to accept 3) after guest feature
>>>>>>>>> negotiation is completed (agreed upon between guest and device).
>>>>>>>> Only two feature sets: 1) what the device has. (2) what is 
>>>>>>>> negotiated
>>>>>>>>> Thanks,
>>>>>>>>> -Siwei
>>>>>>>>>>> Regards,
>>>>>>>>>>> -Siwei
>>>>>>>>>>>
>>>>>>>>>>>>> Having said, please consider to drop the Fixes tag, as 
>>>>>>>>>>>>> appears
>>>>>>>>>>>>> to me you're proposing a new feature rather than fixing a 
>>>>>>>>>>>>> real
>>>>>>>>>>>>> issue.
>>>>>>>>>>>> it's a new feature to report the device feature bits than only
>>>>>>>>>>>> negotiated features, however this patch is a must, or it will
>>>>>>>>>>>> block the device feature bits reporting. but I agree, the fix
>>>>>>>>>>>> tag is not a must.
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>>>> to choose a
>>>>>>>>>>>>>>> appropriate one for a certain guest. This means the users
>>>>>>>>>>>>>>> need to read the
>>>>>>>>>>>>>>> config space before FEATURES_OK, and the existence of 
>>>>>>>>>>>>>>> config
>>>>>>>>>>>>>>> space
>>>>>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>>>> configuration field
>>>>>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
>>>>>>>>>>>>>>> fields which are
>>>>>>>>>>>>>>> conditional on feature bits, as long as those feature bits
>>>>>>>>>>>>>>> are offered by the
>>>>>>>>>>>>>>> device.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>>>> Fix is fine, but fixes tag needs correction described below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>>>>>> And
>>>>>>>>>>>>>> It should be in format
>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration 
>>>>>>>>>>>>>> only if
>>>>>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please use checkpatch.pl script before posting the 
>>>>>>>>>>>>>> patches to
>>>>>>>>>>>>>> catch these errors.
>>>>>>>>>>>>>> There is a bot that looks at the fixes tag and identifies 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c 
>>>>>>>>>>>>>>> index
>>>>>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct 
>>>>>>>>>>>>>>> vdpa_device
>>>>>>>>>>>>>>> *vdev,
>>>>>>>>>>>>>>> struct sk_buff *msg, u32 portid,  {
>>>>>>>>>>>>>>>        u32 device_id;
>>>>>>>>>>>>>>>        void *hdr;
>>>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>>>        int err;
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> down_read(&vdev->cf_lock);
>>>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>>>> -        NL_SET_ERR_MSG_MOD(extack, "Features 
>>>>>>>>>>>>>>> negotiation not
>>>>>>>>>>>>>>> completed");
>>>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>>>> -    }
>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family,
>>>>>>>>>>>>>>> flags,
>>>>>>>>>>>>>>> VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>>>        if (!hdr) {
>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$ 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>
>
Zhu, Lingshan Aug. 4, 2022, 1:41 a.m. UTC | #20
On 8/4/2022 7:09 AM, Si-Wei Liu wrote:
>
>
> On 8/2/2022 7:30 PM, Zhu, Lingshan wrote:
>>
>>
>> On 8/3/2022 9:26 AM, Si-Wei Liu wrote:
>>>
>>>
>>> On 8/1/2022 11:33 PM, Jason Wang wrote:
>>>> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> 
>>>> wrote:
>>>>>
>>>>>
>>>>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>>>>
>>>>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>>>>> 在 2022/7/30 04:55, Si-Wei Liu 写道:
>>>>>>>>
>>>>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>>>>
>>>>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>>>>> Sorry to chime in late in the game. For some reason I 
>>>>>>>>>>>>>> couldn't
>>>>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>>>>> past few weeks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the 
>>>>>>>>>>>>>> way in
>>>>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>>>>           features = 
>>>>>>>>>>>>>> vdev->config->get_driver_features(vdev);
>>>>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>>>> VDPA_ATTR_PAD))
>>>>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense 
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>>>>> I have another patch in this series introduces 
>>>>>>>>>>>>> device_features
>>>>>>>>>>>>> and will report device_features to the userspace even 
>>>>>>>>>>>>> features
>>>>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>>>>> not a default value for mac.
>>>>>>>>>> Then please show us the code, as I can only comment based on 
>>>>>>>>>> your
>>>>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>>>>> migration? For the former, you can do a direct dump on all 
>>>>>>>>>> config
>>>>>>>>>> space fields regardless of endianess and feature negotiation
>>>>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>>>>> mix in exact one command.
>>>>>>>>> This bug just has been revealed two days, and you will see the
>>>>>>>>> patch soon.
>>>>>>>>>
>>>>>>>>> There are something to clarify:
>>>>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>>>>> proper LM destination
>>>>>>>
>>>>>>> So it's probably not very efficient to use this, the manager layer
>>>>>>> should have the knowledge about the compatibility before doing
>>>>>>> migration other than try-and-fail.
>>>>>>>
>>>>>>> And it's the task of the management to gather the nodes whose 
>>>>>>> devices
>>>>>>> could be live migrated to each other as something like "cluster"
>>>>>>> which we've already used in the case of cpuflags.
>>>>>>>
>>>>>>> 1) during node bootstrap, the capability of each node and 
>>>>>>> devices was
>>>>>>> reported to management layer
>>>>>>> 2) management layer decide the cluster and make sure the migration
>>>>>>> can only done among the nodes insides the cluster
>>>>>>> 3) before migration, the vDPA needs to be provisioned on the 
>>>>>>> destination
>>>>>>>
>>>>>>>
>>>>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>>>>> features, there just need a patch for iproute2
>>>>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>>>>> space, MST has explained before
>>>>>>>> So, it's for live migration... Then why not export those config
>>>>>>>> parameters specified for vdpa creation (as well as device feature
>>>>>>>> bits) to the output of "vdpa dev show" command? That's where 
>>>>>>>> device
>>>>>>>> side config lives and is static across vdpa's life cycle. "vdpa 
>>>>>>>> dev
>>>>>>>> config show" is mostly for dynamic driver side config, and the
>>>>>>>> validity is subject to feature negotiation. I suppose this should
>>>>>>>> suit your need of LM, e.g.
>>>>>>>
>>>>>>> I think so.
>>>>>>>
>>>>>>>
>>>>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 
>>>>>>>> mtu 2000
>>>>>>>> $ vdpa dev show vdpa1
>>>>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 
>>>>>>>> max_vqs
>>>>>>>> 15 max_vq_size 256
>>>>>>>>    max_vqp 7 mtu 2000
>>>>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>>>>
>>>>>>> Note that the mgmt should know this destination have those
>>>>>>> capability/features before the provisioning.
>>>>>> Yes, mgmt software should have to check the above from source.
>>>>> On destination mgmt software can run below to check vdpa mgmtdev's
>>>>> capability/features:
>>>>>
>>>>> $ vdpa mgmtdev show pci/0000:41:04.3
>>>>> pci/0000:41:04.3:
>>>>>     supported_classes net
>>>>>     max_supported_vqs 257
>>>>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS 
>>>>> CTRL_VQ
>>>>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>> Right and this is probably better to be done at node bootstrapping for
>>>> the management to know about the cluster.
>>> Exactly. That's what mgmt software is supposed to do typically.
>> I think this could apply to both mgmt devices and vDPA devices:
>> 1)mgmt device, see whether the mgmt device is capable to create a 
>> vDPA device with a certain feature bits, this is for LM
>> 2)vDPA device, report the device features, it is for normal operation
> Can you elaborate the use case "for normal operations"? Then it has 
> nothing to do with LM for sure, correct?
like when you just want to check the features to pick a proper device
>
> Noted for the LM case, just as Jason indicated, it's not even 
> *required* for the mgmt software to gather the device features through 
> "vdpa dev show" on source host *alive* right before live migration is 
> started. Depending on the way how it is implemented, the mgmt software 
> can well collect device capability on boot strap time, or may well 
> save the vdpa device capability/config in persistent store ahead of 
> time, say before any VM is to be launched. Then with all such info 
> collected for each cluster node, mgmt software is able to get its own 
> way to infer and sort out the live migration compatibility between 
> nodes. I'm not sure which case you would need to check the device 
> features, but in case you need it, it'd be better live in "vdpa dev 
> show" than "vdpa dev config show".
it not only for LM
>
> Thanks,
> -Siwei
>
>>
>> Thanks,
>> Zhu Lingshan
>>>
>>> Thanks,
>>> -Siwei
>>>
>>>>
>>>> Thanks
>>>>
>>>>>>>
>>>>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>>>>> _vdpa_register_device() during registration, and get it saved 
>>>>>>>> there
>>>>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Zhu Lingshan
>>>>>>>>>>>>> Nope:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>>>>
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> For optional configuration space fields, the driver MUST 
>>>>>>>>>>>>> check
>>>>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>>>>> and how many driver bugs taking wrong assumption of the 
>>>>>>>>>>>> validity
>>>>>>>>>>>> of config space field without features_ok. I am not sure what
>>>>>>>>>>>> use case you want to expose config resister values for before
>>>>>>>>>>>> features_ok, if it's mostly for live migration I guess it's
>>>>>>>>>>>> probably heading a wrong direction.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last but not the least, this "vdpa dev config" command 
>>>>>>>>>>>>>> was not
>>>>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> vdpa dev config show - Show configuration of specific 
>>>>>>>>>>>>>>> device
>>>>>>>>>>>>>>> or all devices.
>>>>>>>>>>>>>>> DEV - specifies the vdpa device to show its 
>>>>>>>>>>>>>>> configuration. If
>>>>>>>>>>>>>>> this argument is omitted all devices configuration is 
>>>>>>>>>>>>>>> listed.
>>>>>>>>>>>>>> It doesn't say anything about configuration space or 
>>>>>>>>>>>>>> register
>>>>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>>>>> another story.
>>>>>>>>>>>>> I am not sure getting your points. vDPA now reports device
>>>>>>>>>>>>> feature bits(device_features) and negotiated feature
>>>>>>>>>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>>>>>>>>>> subset of the device features; and the vDPA device 
>>>>>>>>>>>>> features can
>>>>>>>>>>>>> be a subset of the management device features.
>>>>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>>>>> values you got from config space via the
>>>>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>>>>> before features_ok (per-spec). Although in some case you 
>>>>>>>>>>>> may get
>>>>>>>>>>>> sane value, such behavior is generally undefined. If you 
>>>>>>>>>>>> desire
>>>>>>>>>>>> to show just the device_features alone without any config 
>>>>>>>>>>>> space
>>>>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me 
>>>>>>>>>>>> this
>>>>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>>>>> work?
>>>>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>>>>> space fields that depend on them. In this series, we report 
>>>>>>>>>>> both
>>>>>>>>>>> to the userspace.
>>>>>>>>>> I fail to understand what you want to present from your
>>>>>>>>>> description. May be worth showing some example outputs that at
>>>>>>>>>> least include the following cases: 1) when device offers 
>>>>>>>>>> features
>>>>>>>>>> but not yet acknowledge by guest 2) when guest acknowledged
>>>>>>>>>> features and device is yet to accept 3) after guest feature
>>>>>>>>>> negotiation is completed (agreed upon between guest and device).
>>>>>>>>> Only two feature sets: 1) what the device has. (2) what is 
>>>>>>>>> negotiated
>>>>>>>>>> Thanks,
>>>>>>>>>> -Siwei
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>
>>>>>>>>>>>>>> Having said, please consider to drop the Fixes tag, as 
>>>>>>>>>>>>>> appears
>>>>>>>>>>>>>> to me you're proposing a new feature rather than fixing a 
>>>>>>>>>>>>>> real
>>>>>>>>>>>>>> issue.
>>>>>>>>>>>>> it's a new feature to report the device feature bits than 
>>>>>>>>>>>>> only
>>>>>>>>>>>>> negotiated features, however this patch is a must, or it will
>>>>>>>>>>>>> block the device feature bits reporting. but I agree, the fix
>>>>>>>>>>>>> tag is not a must.
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>>>>> to choose a
>>>>>>>>>>>>>>>> appropriate one for a certain guest. This means the users
>>>>>>>>>>>>>>>> need to read the
>>>>>>>>>>>>>>>> config space before FEATURES_OK, and the existence of 
>>>>>>>>>>>>>>>> config
>>>>>>>>>>>>>>>> space
>>>>>>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>>>>> configuration field
>>>>>>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
>>>>>>>>>>>>>>>> fields which are
>>>>>>>>>>>>>>>> conditional on feature bits, as long as those feature bits
>>>>>>>>>>>>>>>> are offered by the
>>>>>>>>>>>>>>>> device.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>>>>> Fix is fine, but fixes tag needs correction described 
>>>>>>>>>>>>>>> below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>>>>>>> And
>>>>>>>>>>>>>>> It should be in format
>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration 
>>>>>>>>>>>>>>> only if
>>>>>>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please use checkpatch.pl script before posting the 
>>>>>>>>>>>>>>> patches to
>>>>>>>>>>>>>>> catch these errors.
>>>>>>>>>>>>>>> There is a bot that looks at the fixes tag and 
>>>>>>>>>>>>>>> identifies the
>>>>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c 
>>>>>>>>>>>>>>>> index
>>>>>>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct 
>>>>>>>>>>>>>>>> vdpa_device
>>>>>>>>>>>>>>>> *vdev,
>>>>>>>>>>>>>>>> struct sk_buff *msg, u32 portid, {
>>>>>>>>>>>>>>>>        u32 device_id;
>>>>>>>>>>>>>>>>        void *hdr;
>>>>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>>>>        int err;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> down_read(&vdev->cf_lock);
>>>>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>>>>> - NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>>>>>>>>> completed");
>>>>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>>>>> -    }
>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, 
>>>>>>>>>>>>>>>> &vdpa_nl_family,
>>>>>>>>>>>>>>>> flags,
>>>>>>>>>>>>>>>> VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>>>>        if (!hdr) {
>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$ 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>
>>
>
Zhu, Lingshan Aug. 4, 2022, 1:41 a.m. UTC | #21
On 8/4/2022 7:09 AM, Si-Wei Liu wrote:
>
>
> On 8/2/2022 7:30 PM, Zhu, Lingshan wrote:
>>
>>
>> On 8/3/2022 9:26 AM, Si-Wei Liu wrote:
>>>
>>>
>>> On 8/1/2022 11:33 PM, Jason Wang wrote:
>>>> On Tue, Aug 2, 2022 at 6:58 AM Si-Wei Liu <si-wei.liu@oracle.com> 
>>>> wrote:
>>>>>
>>>>>
>>>>> On 8/1/2022 3:53 PM, Si-Wei Liu wrote:
>>>>>>
>>>>>> On 7/31/2022 9:44 PM, Jason Wang wrote:
>>>>>>> 在 2022/7/30 04:55, Si-Wei Liu 写道:
>>>>>>>>
>>>>>>>> On 7/28/2022 7:04 PM, Zhu, Lingshan wrote:
>>>>>>>>>
>>>>>>>>> On 7/29/2022 5:48 AM, Si-Wei Liu wrote:
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 7:43 PM, Zhu, Lingshan wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 7/28/2022 8:56 AM, Si-Wei Liu wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/27/2022 4:47 AM, Zhu, Lingshan wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/27/2022 5:43 PM, Si-Wei Liu wrote:
>>>>>>>>>>>>>> Sorry to chime in late in the game. For some reason I 
>>>>>>>>>>>>>> couldn't
>>>>>>>>>>>>>> get to most emails for this discussion (I only subscribed to
>>>>>>>>>>>>>> the virtualization list), while I was taking off amongst the
>>>>>>>>>>>>>> past few weeks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks to me this patch is incomplete. Noted down the 
>>>>>>>>>>>>>> way in
>>>>>>>>>>>>>> vdpa_dev_net_config_fill(), we have the following:
>>>>>>>>>>>>>>           features = 
>>>>>>>>>>>>>> vdev->config->get_driver_features(vdev);
>>>>>>>>>>>>>>           if (nla_put_u64_64bit(msg,
>>>>>>>>>>>>>> VDPA_ATTR_DEV_NEGOTIATED_FEATURES, features,
>>>>>>>>>>>>>> VDPA_ATTR_PAD))
>>>>>>>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Making call to .get_driver_features() doesn't make sense 
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> feature negotiation isn't complete. Neither should present
>>>>>>>>>>>>>> negotiated_features to userspace before negotiation is done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Similarly, max_vqp through vdpa_dev_net_mq_config_fill()
>>>>>>>>>>>>>> probably should not show before negotiation is done - it
>>>>>>>>>>>>>> depends on driver features negotiated.
>>>>>>>>>>>>> I have another patch in this series introduces 
>>>>>>>>>>>>> device_features
>>>>>>>>>>>>> and will report device_features to the userspace even 
>>>>>>>>>>>>> features
>>>>>>>>>>>>> negotiation not done. Because the spec says we should allow
>>>>>>>>>>>>> driver access the config space before FEATURES_OK.
>>>>>>>>>>>> The config space can be accessed by guest before features_ok
>>>>>>>>>>>> doesn't necessarily mean the value is valid. You may want to
>>>>>>>>>>>> double check with Michael for what he quoted earlier:
>>>>>>>>>>> that's why I proposed to fix these issues, e.g., if no _F_MAC,
>>>>>>>>>>> vDPA kernel should not return a mac to the userspace, there is
>>>>>>>>>>> not a default value for mac.
>>>>>>>>>> Then please show us the code, as I can only comment based on 
>>>>>>>>>> your
>>>>>>>>>> latest (v4) patch and it was not there.. To be honest, I don't
>>>>>>>>>> understand the motivation and the use cases you have, is it for
>>>>>>>>>> debugging/monitoring or there's really a use case for live
>>>>>>>>>> migration? For the former, you can do a direct dump on all 
>>>>>>>>>> config
>>>>>>>>>> space fields regardless of endianess and feature negotiation
>>>>>>>>>> without having to worry about validity (meaningful to present to
>>>>>>>>>> admin user). To me these are conflict asks that is impossible to
>>>>>>>>>> mix in exact one command.
>>>>>>>>> This bug just has been revealed two days, and you will see the
>>>>>>>>> patch soon.
>>>>>>>>>
>>>>>>>>> There are something to clarify:
>>>>>>>>> 1) we need to read the device features, or how can you pick a
>>>>>>>>> proper LM destination
>>>>>>>
>>>>>>> So it's probably not very efficient to use this, the manager layer
>>>>>>> should have the knowledge about the compatibility before doing
>>>>>>> migration other than try-and-fail.
>>>>>>>
>>>>>>> And it's the task of the management to gather the nodes whose 
>>>>>>> devices
>>>>>>> could be live migrated to each other as something like "cluster"
>>>>>>> which we've already used in the case of cpuflags.
>>>>>>>
>>>>>>> 1) during node bootstrap, the capability of each node and 
>>>>>>> devices was
>>>>>>> reported to management layer
>>>>>>> 2) management layer decide the cluster and make sure the migration
>>>>>>> can only done among the nodes insides the cluster
>>>>>>> 3) before migration, the vDPA needs to be provisioned on the 
>>>>>>> destination
>>>>>>>
>>>>>>>
>>>>>>>>> 2) vdpa dev config show can show both device features and driver
>>>>>>>>> features, there just need a patch for iproute2
>>>>>>>>> 3) To process information like MQ, we don't just dump the config
>>>>>>>>> space, MST has explained before
>>>>>>>> So, it's for live migration... Then why not export those config
>>>>>>>> parameters specified for vdpa creation (as well as device feature
>>>>>>>> bits) to the output of "vdpa dev show" command? That's where 
>>>>>>>> device
>>>>>>>> side config lives and is static across vdpa's life cycle. "vdpa 
>>>>>>>> dev
>>>>>>>> config show" is mostly for dynamic driver side config, and the
>>>>>>>> validity is subject to feature negotiation. I suppose this should
>>>>>>>> suit your need of LM, e.g.
>>>>>>>
>>>>>>> I think so.
>>>>>>>
>>>>>>>
>>>>>>>> $ vdpa dev add name vdpa1 mgmtdev pci/0000:41:04.2 max_vqp 7 
>>>>>>>> mtu 2000
>>>>>>>> $ vdpa dev show vdpa1
>>>>>>>> vdpa1: type network mgmtdev pci/0000:41:04.2 vendor_id 5555 
>>>>>>>> max_vqs
>>>>>>>> 15 max_vq_size 256
>>>>>>>>    max_vqp 7 mtu 2000
>>>>>>>>    dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS
>>>>>>>> CTRL_VQ MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>>>>>
>>>>>>> Note that the mgmt should know this destination have those
>>>>>>> capability/features before the provisioning.
>>>>>> Yes, mgmt software should have to check the above from source.
>>>>> On destination mgmt software can run below to check vdpa mgmtdev's
>>>>> capability/features:
>>>>>
>>>>> $ vdpa mgmtdev show pci/0000:41:04.3
>>>>> pci/0000:41:04.3:
>>>>>     supported_classes net
>>>>>     max_supported_vqs 257
>>>>>     dev_features CSUM GUEST_CSUM MTU HOST_TSO4 HOST_TSO6 STATUS 
>>>>> CTRL_VQ
>>>>> MQ CTRL_MAC_ADDR VERSION_1 RING_PACKED
>>>> Right and this is probably better to be done at node bootstrapping for
>>>> the management to know about the cluster.
>>> Exactly. That's what mgmt software is supposed to do typically.
>> I think this could apply to both mgmt devices and vDPA devices:
>> 1)mgmt device, see whether the mgmt device is capable to create a 
>> vDPA device with a certain feature bits, this is for LM
>> 2)vDPA device, report the device features, it is for normal operation
> Can you elaborate the use case "for normal operations"? Then it has 
> nothing to do with LM for sure, correct?
like when you just want to check the features to pick a proper device
>
> Noted for the LM case, just as Jason indicated, it's not even 
> *required* for the mgmt software to gather the device features through 
> "vdpa dev show" on source host *alive* right before live migration is 
> started. Depending on the way how it is implemented, the mgmt software 
> can well collect device capability on boot strap time, or may well 
> save the vdpa device capability/config in persistent store ahead of 
> time, say before any VM is to be launched. Then with all such info 
> collected for each cluster node, mgmt software is able to get its own 
> way to infer and sort out the live migration compatibility between 
> nodes. I'm not sure which case you would need to check the device 
> features, but in case you need it, it'd be better live in "vdpa dev 
> show" than "vdpa dev config show".
it is not only for LM
>
> Thanks,
> -Siwei
>
>>
>> Thanks,
>> Zhu Lingshan
>>>
>>> Thanks,
>>> -Siwei
>>>
>>>>
>>>> Thanks
>>>>
>>>>>>>
>>>>>>>> For it to work, you'd want to pass "struct vdpa_dev_set_config" to
>>>>>>>> _vdpa_register_device() during registration, and get it saved 
>>>>>>>> there
>>>>>>>> in "struct vdpa_device". Then in vdpa_dev_fill() show each field
>>>>>>>> conditionally subject to "struct vdpa_dev_set_config.mask".
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Siwei
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Zhu Lingshan
>>>>>>>>>>>>> Nope:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>>>>>>>>
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> For optional configuration space fields, the driver MUST 
>>>>>>>>>>>>> check
>>>>>>>>>>>>> that the corresponding feature is offered
>>>>>>>>>>>>> before accessing that part of the configuration space.
>>>>>>>>>>>> and how many driver bugs taking wrong assumption of the 
>>>>>>>>>>>> validity
>>>>>>>>>>>> of config space field without features_ok. I am not sure what
>>>>>>>>>>>> use case you want to expose config resister values for before
>>>>>>>>>>>> features_ok, if it's mostly for live migration I guess it's
>>>>>>>>>>>> probably heading a wrong direction.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last but not the least, this "vdpa dev config" command 
>>>>>>>>>>>>>> was not
>>>>>>>>>>>>>> designed to display the real config space register values in
>>>>>>>>>>>>>> the first place. Quoting the vdpa-dev(8) man page:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> vdpa dev config show - Show configuration of specific 
>>>>>>>>>>>>>>> device
>>>>>>>>>>>>>>> or all devices.
>>>>>>>>>>>>>>> DEV - specifies the vdpa device to show its 
>>>>>>>>>>>>>>> configuration. If
>>>>>>>>>>>>>>> this argument is omitted all devices configuration is 
>>>>>>>>>>>>>>> listed.
>>>>>>>>>>>>>> It doesn't say anything about configuration space or 
>>>>>>>>>>>>>> register
>>>>>>>>>>>>>> values in config space. As long as it can convey the config
>>>>>>>>>>>>>> attribute when instantiating vDPA device instance, and more
>>>>>>>>>>>>>> importantly, the config can be easily imported from or
>>>>>>>>>>>>>> exported to userspace tools when trying to reconstruct vdpa
>>>>>>>>>>>>>> instance intact on destination host for live migration, IMHO
>>>>>>>>>>>>>> in my personal interpretation it doesn't matter what the
>>>>>>>>>>>>>> config space may present. It may be worth while adding a new
>>>>>>>>>>>>>> debug command to expose the real register value, but that's
>>>>>>>>>>>>>> another story.
>>>>>>>>>>>>> I am not sure getting your points. vDPA now reports device
>>>>>>>>>>>>> feature bits(device_features) and negotiated feature
>>>>>>>>>>>>> bits(driver_features), and yes, the drivers features can be a
>>>>>>>>>>>>> subset of the device features; and the vDPA device 
>>>>>>>>>>>>> features can
>>>>>>>>>>>>> be a subset of the management device features.
>>>>>>>>>>>> What I said is after unblocking the conditional check, you'd
>>>>>>>>>>>> have to handle the case for each of the vdpa attribute when
>>>>>>>>>>>> feature negotiation is not yet done: basically the register
>>>>>>>>>>>> values you got from config space via the
>>>>>>>>>>>> vdpa_get_config_unlocked() call is not considered to be valid
>>>>>>>>>>>> before features_ok (per-spec). Although in some case you 
>>>>>>>>>>>> may get
>>>>>>>>>>>> sane value, such behavior is generally undefined. If you 
>>>>>>>>>>>> desire
>>>>>>>>>>>> to show just the device_features alone without any config 
>>>>>>>>>>>> space
>>>>>>>>>>>> field, which the device had advertised *before feature
>>>>>>>>>>>> negotiation is complete*, that'll be fine. But looks to me 
>>>>>>>>>>>> this
>>>>>>>>>>>> is not how patch has been implemented. Probably need some more
>>>>>>>>>>>> work?
>>>>>>>>>>> They are driver_features(negotiated) and the
>>>>>>>>>>> device_features(which comes with the device), and the config
>>>>>>>>>>> space fields that depend on them. In this series, we report 
>>>>>>>>>>> both
>>>>>>>>>>> to the userspace.
>>>>>>>>>> I fail to understand what you want to present from your
>>>>>>>>>> description. May be worth showing some example outputs that at
>>>>>>>>>> least include the following cases: 1) when device offers 
>>>>>>>>>> features
>>>>>>>>>> but not yet acknowledge by guest 2) when guest acknowledged
>>>>>>>>>> features and device is yet to accept 3) after guest feature
>>>>>>>>>> negotiation is completed (agreed upon between guest and device).
>>>>>>>>> Only two feature sets: 1) what the device has. (2) what is 
>>>>>>>>> negotiated
>>>>>>>>>> Thanks,
>>>>>>>>>> -Siwei
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>
>>>>>>>>>>>>>> Having said, please consider to drop the Fixes tag, as 
>>>>>>>>>>>>>> appears
>>>>>>>>>>>>>> to me you're proposing a new feature rather than fixing a 
>>>>>>>>>>>>>> real
>>>>>>>>>>>>>> issue.
>>>>>>>>>>>>> it's a new feature to report the device feature bits than 
>>>>>>>>>>>>> only
>>>>>>>>>>>>> negotiated features, however this patch is a must, or it will
>>>>>>>>>>>>> block the device feature bits reporting. but I agree, the fix
>>>>>>>>>>>>> tag is not a must.
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -Siwei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 7/1/2022 3:12 PM, Parav Pandit via Virtualization wrote:
>>>>>>>>>>>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Users may want to query the config space of a vDPA device,
>>>>>>>>>>>>>>>> to choose a
>>>>>>>>>>>>>>>> appropriate one for a certain guest. This means the users
>>>>>>>>>>>>>>>> need to read the
>>>>>>>>>>>>>>>> config space before FEATURES_OK, and the existence of 
>>>>>>>>>>>>>>>> config
>>>>>>>>>>>>>>>> space
>>>>>>>>>>>>>>>> contents does not depend on FEATURES_OK.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The spec says:
>>>>>>>>>>>>>>>> The device MUST allow reading of any device-specific
>>>>>>>>>>>>>>>> configuration field
>>>>>>>>>>>>>>>> before FEATURES_OK is set by the driver. This includes
>>>>>>>>>>>>>>>> fields which are
>>>>>>>>>>>>>>>> conditional on feature bits, as long as those feature bits
>>>>>>>>>>>>>>>> are offered by the
>>>>>>>>>>>>>>>> device.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a07 (vdpa: Read device configuration only
>>>>>>>>>>>>>>>> if FEATURES_OK)
>>>>>>>>>>>>>>> Fix is fine, but fixes tag needs correction described 
>>>>>>>>>>>>>>> below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Above commit id is 13 letters should be 12.
>>>>>>>>>>>>>>> And
>>>>>>>>>>>>>>> It should be in format
>>>>>>>>>>>>>>> Fixes: 30ef7a8ac8a0 ("vdpa: Read device configuration 
>>>>>>>>>>>>>>> only if
>>>>>>>>>>>>>>> FEATURES_OK")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please use checkpatch.pl script before posting the 
>>>>>>>>>>>>>>> patches to
>>>>>>>>>>>>>>> catch these errors.
>>>>>>>>>>>>>>> There is a bot that looks at the fixes tag and 
>>>>>>>>>>>>>>> identifies the
>>>>>>>>>>>>>>> right kernel version to apply this fix.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Signed-off-by: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>    drivers/vdpa/vdpa.c | 8 --------
>>>>>>>>>>>>>>>>    1 file changed, 8 deletions(-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c 
>>>>>>>>>>>>>>>> index
>>>>>>>>>>>>>>>> 9b0e39b2f022..d76b22b2f7ae 100644
>>>>>>>>>>>>>>>> --- a/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> +++ b/drivers/vdpa/vdpa.c
>>>>>>>>>>>>>>>> @@ -851,17 +851,9 @@ vdpa_dev_config_fill(struct 
>>>>>>>>>>>>>>>> vdpa_device
>>>>>>>>>>>>>>>> *vdev,
>>>>>>>>>>>>>>>> struct sk_buff *msg, u32 portid, {
>>>>>>>>>>>>>>>>        u32 device_id;
>>>>>>>>>>>>>>>>        void *hdr;
>>>>>>>>>>>>>>>> -    u8 status;
>>>>>>>>>>>>>>>>        int err;
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> down_read(&vdev->cf_lock);
>>>>>>>>>>>>>>>> -    status = vdev->config->get_status(vdev);
>>>>>>>>>>>>>>>> -    if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
>>>>>>>>>>>>>>>> - NL_SET_ERR_MSG_MOD(extack, "Features negotiation not
>>>>>>>>>>>>>>>> completed");
>>>>>>>>>>>>>>>> -        err = -EAGAIN;
>>>>>>>>>>>>>>>> -        goto out;
>>>>>>>>>>>>>>>> -    }
>>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>>>        hdr = genlmsg_put(msg, portid, seq, 
>>>>>>>>>>>>>>>> &vdpa_nl_family,
>>>>>>>>>>>>>>>> flags,
>>>>>>>>>>>>>>>> VDPA_CMD_DEV_CONFIG_GET);
>>>>>>>>>>>>>>>>        if (!hdr) {
>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>> 2.31.1
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Virtualization mailing list
>>>>>>>>>>>>>>> Virtualization@lists.linux-foundation.org
>>>>>>>>>>>>>>> https://urldefense.com/v3/__https://lists.linuxfoundation.org/mailman/listinfo/virtualization__;!!ACWV5N9M2RV99hQ!NzOv5Ew_Z2CP-zHyD7RsUoStLZ54KpB21QyuZ8L63YVPLEGDEwvcOSDlIGxQPHY-DMkOa9sKKZdBSaNknMU$ 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>
>>
>
diff mbox series

Patch

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index 9b0e39b2f022..d76b22b2f7ae 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -851,17 +851,9 @@  vdpa_dev_config_fill(struct vdpa_device *vdev, struct sk_buff *msg, u32 portid,
 {
 	u32 device_id;
 	void *hdr;
-	u8 status;
 	int err;
 
 	down_read(&vdev->cf_lock);
-	status = vdev->config->get_status(vdev);
-	if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
-		NL_SET_ERR_MSG_MOD(extack, "Features negotiation not completed");
-		err = -EAGAIN;
-		goto out;
-	}
-
 	hdr = genlmsg_put(msg, portid, seq, &vdpa_nl_family, flags,
 			  VDPA_CMD_DEV_CONFIG_GET);
 	if (!hdr) {