
[3/9] iommu: Add common code to handle IO page faults

Message ID: 20230711010642.19707-4-baolu.lu@linux.intel.com (mailing list archive)
State: New, archived
Series: iommu: Prepare to deliver page faults to user space

Commit Message

Baolu Lu July 11, 2023, 1:06 a.m. UTC
The individual iommu drivers report iommu faults by calling
iommu_report_device_fault(), where a pre-registered device fault handler
is called to route the fault to another fault handler installed on the
corresponding iommu domain.

The pre-registered device fault handler doesn't need to be dynamic, as
the fault handler will eventually be per iommu domain. Replace the
per-device fault handler with static common code to avoid the
unnecessary indirection.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/iommu/iommu.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)
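
For context, individual iommu drivers enter this path through
iommu_report_device_fault(). A minimal sketch of such a call follows; the
field values are illustrative only, and pasid, group_id and iova are assumed
to come from the hardware's page request descriptor:

	/* In an iommu driver's page request handler (sketch): */
	struct iommu_fault_event evt = {
		.fault = {
			.type = IOMMU_FAULT_PAGE_REQ,
			.prm = {
				.flags = IOMMU_FAULT_PAGE_REQUEST_PASID_VALID |
					 IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
				.pasid = pasid,
				.grpid = group_id,
				.perm  = IOMMU_FAULT_PERM_READ,
				.addr  = iova & PAGE_MASK,
			},
		},
	};

	ret = iommu_report_device_fault(dev, &evt);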

Comments

Tian, Kevin July 11, 2023, 6:12 a.m. UTC | #1
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, July 11, 2023 9:07 AM
> 
> +static int iommu_handle_io_pgfault(struct device *dev,
> +				   struct iommu_fault *fault)
> +{
> +	struct iommu_domain *domain;
> +
> +	if (fault->type != IOMMU_FAULT_PAGE_REQ)
> +		return -EINVAL;
> +
> +	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
> +		domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0);
> +	else
> +		domain = iommu_get_domain_for_dev(dev);
> +
> +	if (!domain || !domain->iopf_handler)
> +		return -ENODEV;
> +
> +	if (domain->iopf_handler == iommu_sva_handle_iopf)
> +		return iommu_queue_iopf(fault, dev);

You can avoid the special check by directly making iommu_queue_iopf
the iopf_handler for the sva domain.
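
A sketch of that suggestion: iommu_queue_iopf() would need its signature
aligned with the iopf_handler prototype first (the extra data argument would
simply be unused for SVA):

	/* Align the signature with domain->iopf_handler: */
	int iommu_queue_iopf(struct iommu_fault *fault, struct device *dev,
			     void *data);

	/* When allocating the SVA domain: */
	domain->iopf_handler = iommu_queue_iopf;

	/* The common code then needs no SVA special case: */
	return domain->iopf_handler(fault, dev, domain->fault_data);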

> +
> +	return domain->iopf_handler(fault, dev, domain->fault_data);
> +}

btw, is there value in moving the group handling logic from
iommu_queue_iopf() to this common function?

I wonder whether there is any correctness issue in not forwarding
partial requests to iommufd. If not, this can also help reduce
notifications to the user until the group is ready.
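
For reference, the group handling logic in question amounts to roughly the
following inside iommu_queue_iopf() (simplified from
drivers/iommu/io-pgfault.c; the real code also matches the PASID and handles
allocation failures):

	if (!(fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE)) {
		/* Stash the partial fault until the group is complete. */
		list_add(&iopf->list, &iopf_param->partial);
		return 0;
	}

	/* Last fault of the group: collect the stashed partials ... */
	list_for_each_entry_safe(iopf, next, &iopf_param->partial, list)
		if (iopf->fault.prm.grpid == fault->prm.grpid)
			list_move(&iopf->list, &group->faults);

	/* ... and hand the whole group to the domain's fault handler. */
	queue_work(iopf_param->queue->wq, &group->work);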
Jacob Pan July 11, 2023, 8:50 p.m. UTC | #2
Hi Lu,

On Tue, 11 Jul 2023 09:06:36 +0800, Lu Baolu <baolu.lu@linux.intel.com>
wrote:

> The individual iommu drivers report iommu faults by calling
> iommu_report_device_fault(), where a pre-registered device fault handler
> is called to route the fault to another fault handler installed on the
> corresponding iommu domain.
> 
> The pre-registered device fault handler doesn't need to be dynamic, as
> the fault handler will eventually be per iommu domain. Replace the
> per-device fault handler with static common code to avoid the
> unnecessary indirection.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/iommu/iommu.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index da340f11c5f5..41328f03e8b4 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1278,6 +1278,28 @@ int iommu_unregister_device_fault_handler(struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler);
>  
> +static int iommu_handle_io_pgfault(struct device *dev,
> +				   struct iommu_fault *fault)
> +{
> +	struct iommu_domain *domain;
> +
> +	if (fault->type != IOMMU_FAULT_PAGE_REQ)
> +		return -EINVAL;
> +
> +	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
> +		domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0);
> +	else
> +		domain = iommu_get_domain_for_dev(dev);
we don't support IOPF w/o PASID yet, right?

> +
> +	if (!domain || !domain->iopf_handler)
> +		return -ENODEV;
> +
> +	if (domain->iopf_handler == iommu_sva_handle_iopf)
> +		return iommu_queue_iopf(fault, dev);
Just wondering why there is special treatment for the SVA domain. Can
iommu_queue_iopf() be used as the SVA domain->iopf_handler?

> +
> +	return domain->iopf_handler(fault, dev, domain->fault_data);
> +}
> +
>  /**
>   * iommu_report_device_fault() - Report fault event to device driver
>   * @dev: the device
> @@ -1320,7 +1342,7 @@ int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt)
>  		mutex_unlock(&fparam->lock);
>  	}
>  
> -	ret = fparam->handler(&evt->fault, fparam->data);
> +	ret = iommu_handle_io_pgfault(dev, &evt->fault);
>  	if (ret && evt_pending) {
>  		mutex_lock(&fparam->lock);
>  		list_del(&evt_pending->list);


Thanks,

Jacob
Baolu Lu July 12, 2023, 2:32 a.m. UTC | #3
On 2023/7/11 14:12, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, July 11, 2023 9:07 AM
>>
>> +static int iommu_handle_io_pgfault(struct device *dev,
>> +				   struct iommu_fault *fault)
>> +{
>> +	struct iommu_domain *domain;
>> +
>> +	if (fault->type != IOMMU_FAULT_PAGE_REQ)
>> +		return -EINVAL;
>> +
>> +	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
>> +		domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0);
>> +	else
>> +		domain = iommu_get_domain_for_dev(dev);
>> +
>> +	if (!domain || !domain->iopf_handler)
>> +		return -ENODEV;
>> +
>> +	if (domain->iopf_handler == iommu_sva_handle_iopf)
>> +		return iommu_queue_iopf(fault, dev);
> 
> You can avoid the special check by directly making iommu_queue_iopf
> the iopf_handler for the sva domain.

Yeah, good catch!

> 
>> +
>> +	return domain->iopf_handler(fault, dev, domain->fault_data);
>> +}
> 
> btw, is there value in moving the group handling logic from
> iommu_queue_iopf() to this common function?
> 
> I wonder whether there is any correctness issue in not forwarding
> partial requests to iommufd. If not, this can also help reduce
> notifications to the user until the group is ready.

I don't think there's any correctness issue. But it would be better if
we could inject the page faults into VM guests as soon as possible. There's
no requirement to put page requests into the vIOMMU's hardware page request
queue at the granularity of a fault group. Thoughts?

Best regards,
baolu
Baolu Lu July 12, 2023, 2:37 a.m. UTC | #4
On 2023/7/12 4:50, Jacob Pan wrote:
> Hi Lu,
> 
> On Tue, 11 Jul 2023 09:06:36 +0800, Lu Baolu <baolu.lu@linux.intel.com>
> wrote:
> 
>> The individual iommu drivers report iommu faults by calling
>> iommu_report_device_fault(), where a pre-registered device fault handler
>> is called to route the fault to another fault handler installed on the
>> corresponding iommu domain.
>>
>> The pre-registered device fault handler doesn't need to be dynamic, as
>> the fault handler will eventually be per iommu domain. Replace the
>> per-device fault handler with static common code to avoid the
>> unnecessary indirection.
>>
>> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
>> ---
>>   drivers/iommu/iommu.c | 24 +++++++++++++++++++++++-
>>   1 file changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index da340f11c5f5..41328f03e8b4 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -1278,6 +1278,28 @@ int iommu_unregister_device_fault_handler(struct device *dev)
>>  }
>>   EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler);
>>   
>> +static int iommu_handle_io_pgfault(struct device *dev,
>> +				   struct iommu_fault *fault)
>> +{
>> +	struct iommu_domain *domain;
>> +
>> +	if (fault->type != IOMMU_FAULT_PAGE_REQ)
>> +		return -EINVAL;
>> +
>> +	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
>> +		domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0);
>> +	else
>> +		domain = iommu_get_domain_for_dev(dev);
> we don't support IOPF w/o PASID yet, right?

It's the individual driver that decides whether iopf w/o pasid is
supported or not. The iommu core doesn't need to make such an assumption.

> 
>> +
>> +	if (!domain || !domain->iopf_handler)
>> +		return -ENODEV;
>> +
>> +	if (domain->iopf_handler == iommu_sva_handle_iopf)
>> +		return iommu_queue_iopf(fault, dev);
> Just wondering why there is special treatment for the SVA domain. Can
> iommu_queue_iopf() be used as the SVA domain->iopf_handler?

Yes. I will make this change according to Kevin's suggestion in this
thread.

> 
>> +
>> +	return domain->iopf_handler(fault, dev, domain->fault_data);
>> +}
>> +
>>   /**
>>    * iommu_report_device_fault() - Report fault event to device driver
>>    * @dev: the device
>> @@ -1320,7 +1342,7 @@ int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt)
>>  		mutex_unlock(&fparam->lock);
>>   	}
>>   
>> -	ret = fparam->handler(&evt->fault, fparam->data);
>> +	ret = iommu_handle_io_pgfault(dev, &evt->fault);
>>   	if (ret && evt_pending) {
>>   		mutex_lock(&fparam->lock);
>>   		list_del(&evt_pending->list);
> 
> 
> Thanks,
> 
> Jacob
> 

Best regards,
baolu
Jean-Philippe Brucker July 12, 2023, 9:45 a.m. UTC | #5
On Wed, Jul 12, 2023 at 10:32:13AM +0800, Baolu Lu wrote:
> > btw, is there value in moving the group handling logic from
> > iommu_queue_iopf() to this common function?
> > 
> > I wonder whether there is any correctness issue in not forwarding
> > partial requests to iommufd. If not, this can also help reduce
> > notifications to the user until the group is ready.
> 
> I don't think there's any correctness issue. But it would be better if
> we could inject the page faults into VM guests as soon as possible. There's
> no requirement to put page requests into the vIOMMU's hardware page request
> queue at the granularity of a fault group. Thoughts?

Not sure I understand you correctly, but we can't inject partial fault
groups: if the HW PRI queue overflows, the last fault in a group may be
lost, so the non-last faults in that group already injected won't be
completed (until PRGI reuse), leaking PRI request credits and guest
resources.

Thanks,
Jean
Baolu Lu July 13, 2023, 4:02 a.m. UTC | #6
On 2023/7/12 17:45, Jean-Philippe Brucker wrote:
> On Wed, Jul 12, 2023 at 10:32:13AM +0800, Baolu Lu wrote:
>>> btw, is there value in moving the group handling logic from
>>> iommu_queue_iopf() to this common function?
>>>
>>> I wonder whether there is any correctness issue in not forwarding
>>> partial requests to iommufd. If not, this can also help reduce
>>> notifications to the user until the group is ready.
>>
>> I don't think there's any correctness issue. But it would be better if
>> we could inject the page faults into VM guests as soon as possible. There's
>> no requirement to put page requests into the vIOMMU's hardware page request
>> queue at the granularity of a fault group. Thoughts?
> 
> Not sure I understand you correctly, but we can't inject partial fault
> groups: if the HW PRI queue overflows, the last fault in a group may be
> lost, so the non-last faults in that group already injected won't be
> completed (until PRGI reuse), leaking PRI request credits and guest
> resources.

Yeah, that's how the vIOMMU injects the faults. On the host/hypervisor
side, my understanding is that faults should be forwarded as soon as
possible.

Best regards,
baolu

Patch

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index da340f11c5f5..41328f03e8b4 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1278,6 +1278,28 @@  int iommu_unregister_device_fault_handler(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_unregister_device_fault_handler);
 
+static int iommu_handle_io_pgfault(struct device *dev,
+				   struct iommu_fault *fault)
+{
+	struct iommu_domain *domain;
+
+	if (fault->type != IOMMU_FAULT_PAGE_REQ)
+		return -EINVAL;
+
+	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
+		domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0);
+	else
+		domain = iommu_get_domain_for_dev(dev);
+
+	if (!domain || !domain->iopf_handler)
+		return -ENODEV;
+
+	if (domain->iopf_handler == iommu_sva_handle_iopf)
+		return iommu_queue_iopf(fault, dev);
+
+	return domain->iopf_handler(fault, dev, domain->fault_data);
+}
+
 /**
  * iommu_report_device_fault() - Report fault event to device driver
  * @dev: the device
@@ -1320,7 +1342,7 @@  int iommu_report_device_fault(struct device *dev, struct iommu_fault_event *evt)
 		mutex_unlock(&fparam->lock);
 	}
 
-	ret = fparam->handler(&evt->fault, fparam->data);
+	ret = iommu_handle_io_pgfault(dev, &evt->fault);
 	if (ret && evt_pending) {
 		mutex_lock(&fparam->lock);
 		list_del(&evt_pending->list);
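
With iommu_handle_io_pgfault() in place, a consumer such as the iommufd
path later in this series only needs to populate the two domain fields. A
hypothetical example (the handler name and context type are illustrative,
not part of this patch):

	/* Hypothetical consumer: route this domain's faults to its own code. */
	static int my_iopf_handler(struct iommu_fault *fault, struct device *dev,
				   void *data)
	{
		struct my_fault_ctx *ctx = data;	/* illustrative context */

		/* e.g. queue the fault for delivery to user space ... */
		return 0;
	}

	/* At domain setup time: */
	domain->iopf_handler = my_iopf_handler;
	domain->fault_data = ctx;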