Message ID | 20240229091152.56664-1-shameerali.kolothum.thodi@huawei.com
---|---
State | New, archived
Series | hisi_acc_vfio_pci: Remove the deferred_reset logic
On 2/29/2024 1:11 AM, Shameer Kolothum wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>
>
> The deferred_reset logic was added to vfio migration drivers to prevent
> a circular locking dependency with respect to mm_lock and state mutex.
> This is mainly because of the copy_to/from_user() functions (which take
> mm_lock) invoked under state mutex. But for the HiSilicon driver, the only
> place where we now hold the state mutex for copy_to_user is during the
> PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
> updated the data and perform copy_to_user without state mutex. By this,
> we can get rid of the deferred_reset logic.
>
> Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Shameer,

Thanks for providing this example. After seeing this, it probably
doesn't make sense to accept my 2/2 patch at
https://lore.kernel.org/kvm/20240228003205.47311-3-brett.creeley@amd.com/.

I have reworked that patch and am currently doing some testing with it
to make sure it's functional. Once I have some results I will send a v3.

Thanks,

Brett

> ---
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 48 +++++--------------
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h |  6 +--
>  2 files changed, 14 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> index 4d27465c8f1a..9a3e97108ace 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> @@ -630,25 +630,11 @@ static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vde
>  	}
>  }
>
> -/*
> - * This function is called in all state_mutex unlock cases to
> - * handle a 'deferred_reset' if exists.
> - */
> -static void
> -hisi_acc_vf_state_mutex_unlock(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> +static void hisi_acc_vf_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>  {
> -again:
> -	spin_lock(&hisi_acc_vdev->reset_lock);
> -	if (hisi_acc_vdev->deferred_reset) {
> -		hisi_acc_vdev->deferred_reset = false;
> -		spin_unlock(&hisi_acc_vdev->reset_lock);
> -		hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
> -		hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
> -		hisi_acc_vf_disable_fds(hisi_acc_vdev);
> -		goto again;
> -	}
> -	mutex_unlock(&hisi_acc_vdev->state_mutex);
> -	spin_unlock(&hisi_acc_vdev->reset_lock);
> +	hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
> +	hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
> +	hisi_acc_vf_disable_fds(hisi_acc_vdev);
>  }
>
>  static void hisi_acc_vf_start_device(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> @@ -804,8 +790,10 @@ static long hisi_acc_vf_precopy_ioctl(struct file *filp,
>
>  	info.dirty_bytes = 0;
>  	info.initial_bytes = migf->total_length - *pos;
> +	mutex_unlock(&migf->lock);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>
> -	ret = copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0;
> +	return copy_to_user((void __user *)arg, &info, minsz) ?
> +			-EFAULT : 0;
>  out:
>  	mutex_unlock(&migf->lock);
>  	mutex_unlock(&hisi_acc_vdev->state_mutex);
> @@ -1071,7 +1059,7 @@ hisi_acc_vfio_pci_set_device_state(struct vfio_device *vdev,
>  			break;
>  		}
>  	}
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  	return res;
>  }
>
> @@ -1092,7 +1080,7 @@ hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
>
>  	mutex_lock(&hisi_acc_vdev->state_mutex);
>  	*curr_state = hisi_acc_vdev->mig_state;
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  	return 0;
>  }
>
> @@ -1104,21 +1092,9 @@ static void hisi_acc_vf_pci_aer_reset_done(struct pci_dev *pdev)
>  	    VFIO_MIGRATION_STOP_COPY)
>  		return;
>
> -	/*
> -	 * As the higher VFIO layers are holding locks across reset and using
> -	 * those same locks with the mm_lock we need to prevent ABBA deadlock
> -	 * with the state_mutex and mm_lock.
> -	 * In case the state_mutex was taken already we defer the cleanup work
> -	 * to the unlock flow of the other running context.
> -	 */
> -	spin_lock(&hisi_acc_vdev->reset_lock);
> -	hisi_acc_vdev->deferred_reset = true;
> -	if (!mutex_trylock(&hisi_acc_vdev->state_mutex)) {
> -		spin_unlock(&hisi_acc_vdev->reset_lock);
> -		return;
> -	}
> -	spin_unlock(&hisi_acc_vdev->reset_lock);
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_lock(&hisi_acc_vdev->state_mutex);
> +	hisi_acc_vf_reset(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  }
>
>  static int hisi_acc_vf_qm_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> index dcabfeec6ca1..5bab46602fad 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> @@ -98,8 +98,8 @@ struct hisi_acc_vf_migration_file {
>
>  struct hisi_acc_vf_core_device {
>  	struct vfio_pci_core_device core_device;
> -	u8 match_done:1;
> -	u8 deferred_reset:1;
> +	u8 match_done;
> +
>  	/* For migration state */
>  	struct mutex state_mutex;
>  	enum vfio_device_mig_state mig_state;
> @@ -109,8 +109,6 @@ struct hisi_acc_vf_core_device {
>  	struct hisi_qm vf_qm;
>  	u32 vf_qm_state;
>  	int vf_id;
> -	/* For reset handler */
> -	spinlock_t reset_lock;
>  	struct hisi_acc_vf_migration_file *resuming_migf;
>  	struct hisi_acc_vf_migration_file *saving_migf;
>  };
> --
> 2.34.1
>
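[Editor's note: the locking pattern the patch adopts in the pre-copy ioctl can be sketched in plain userspace C. This is a minimal illustration, not driver code: `struct dev_state`, `struct precopy_info`, and `precopy_ioctl()` are hypothetical stand-ins, `pthread_mutex_t` stands in for the kernel's `struct mutex`, and `memcpy()` stands in for `copy_to_user()`. The point is only the ordering: snapshot the data under the state mutex, drop the mutex, then do the copy with no lock held.]

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for struct vfio_precopy_info. */
struct precopy_info {
	unsigned long initial_bytes;
	unsigned long dirty_bytes;
};

/* Hypothetical stand-in for the device's migration state. */
struct dev_state {
	pthread_mutex_t state_mutex;   /* stands in for state_mutex */
	unsigned long total_length;
	unsigned long pos;
};

static int precopy_ioctl(struct dev_state *d, struct precopy_info *user_out)
{
	struct precopy_info info;

	/* Compute the info snapshot while holding the state mutex... */
	pthread_mutex_lock(&d->state_mutex);
	info.dirty_bytes = 0;
	info.initial_bytes = d->total_length - d->pos;
	pthread_mutex_unlock(&d->state_mutex);

	/* ...then copy it out with NO lock held (copy_to_user() analogue),
	 * so a concurrent reset path can simply take state_mutex without
	 * any deferred_reset bookkeeping. */
	memcpy(user_out, &info, sizeof(info));
	return 0;
}
```

Because the copy no longer runs under state_mutex, the mm_lock-vs-state_mutex ordering that motivated deferred_reset never arises on this path.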
On 2/29/2024 1:11 AM, Shameer Kolothum wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>
>
> The deferred_reset logic was added to vfio migration drivers to prevent
> a circular locking dependency with respect to mm_lock and state mutex.
> This is mainly because of the copy_to/from_user() functions (which take
> mm_lock) invoked under state mutex. But for the HiSilicon driver, the only
> place where we now hold the state mutex for copy_to_user is during the
> PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
> updated the data and perform copy_to_user without state mutex. By this,
> we can get rid of the deferred_reset logic.
>
> Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 48 +++++--------------
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h |  6 +--
>  2 files changed, 14 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> index 4d27465c8f1a..9a3e97108ace 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> @@ -630,25 +630,11 @@ static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vde
>  	}
>  }
>
> -/*
> - * This function is called in all state_mutex unlock cases to
> - * handle a 'deferred_reset' if exists.
> - */
> -static void
> -hisi_acc_vf_state_mutex_unlock(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> +static void hisi_acc_vf_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>  {
> -again:
> -	spin_lock(&hisi_acc_vdev->reset_lock);
> -	if (hisi_acc_vdev->deferred_reset) {
> -		hisi_acc_vdev->deferred_reset = false;
> -		spin_unlock(&hisi_acc_vdev->reset_lock);
> -		hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
> -		hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
> -		hisi_acc_vf_disable_fds(hisi_acc_vdev);
> -		goto again;
> -	}
> -	mutex_unlock(&hisi_acc_vdev->state_mutex);
> -	spin_unlock(&hisi_acc_vdev->reset_lock);
> +	hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
> +	hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
> +	hisi_acc_vf_disable_fds(hisi_acc_vdev);
>  }
>
>  static void hisi_acc_vf_start_device(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> @@ -804,8 +790,10 @@ static long hisi_acc_vf_precopy_ioctl(struct file *filp,
>
>  	info.dirty_bytes = 0;
>  	info.initial_bytes = migf->total_length - *pos;
> +	mutex_unlock(&migf->lock);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>
> -	ret = copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0;
> +	return copy_to_user((void __user *)arg, &info, minsz) ?
> +			-EFAULT : 0;
>  out:
>  	mutex_unlock(&migf->lock);
>  	mutex_unlock(&hisi_acc_vdev->state_mutex);
> @@ -1071,7 +1059,7 @@ hisi_acc_vfio_pci_set_device_state(struct vfio_device *vdev,
>  			break;
>  		}
>  	}
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  	return res;
>  }
>
> @@ -1092,7 +1080,7 @@ hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
>
>  	mutex_lock(&hisi_acc_vdev->state_mutex);
>  	*curr_state = hisi_acc_vdev->mig_state;
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  	return 0;
>  }
>
> @@ -1104,21 +1092,9 @@ static void hisi_acc_vf_pci_aer_reset_done(struct pci_dev *pdev)
>  	    VFIO_MIGRATION_STOP_COPY)
>  		return;
>
> -	/*
> -	 * As the higher VFIO layers are holding locks across reset and using
> -	 * those same locks with the mm_lock we need to prevent ABBA deadlock
> -	 * with the state_mutex and mm_lock.
> -	 * In case the state_mutex was taken already we defer the cleanup work
> -	 * to the unlock flow of the other running context.
> -	 */
> -	spin_lock(&hisi_acc_vdev->reset_lock);
> -	hisi_acc_vdev->deferred_reset = true;
> -	if (!mutex_trylock(&hisi_acc_vdev->state_mutex)) {
> -		spin_unlock(&hisi_acc_vdev->reset_lock);
> -		return;
> -	}
> -	spin_unlock(&hisi_acc_vdev->reset_lock);
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_lock(&hisi_acc_vdev->state_mutex);
> +	hisi_acc_vf_reset(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  }
>
>  static int hisi_acc_vf_qm_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> index dcabfeec6ca1..5bab46602fad 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> @@ -98,8 +98,8 @@ struct hisi_acc_vf_migration_file {
>
>  struct hisi_acc_vf_core_device {
>  	struct vfio_pci_core_device core_device;
> -	u8 match_done:1;
> -	u8 deferred_reset:1;
> +	u8 match_done;
> +
>  	/* For migration state */
>  	struct mutex state_mutex;
>  	enum vfio_device_mig_state mig_state;
> @@ -109,8 +109,6 @@ struct hisi_acc_vf_core_device {
>  	struct hisi_qm vf_qm;
>  	u32 vf_qm_state;
>  	int vf_id;
> -	/* For reset handler */
> -	spinlock_t reset_lock;
>  	struct hisi_acc_vf_migration_file *resuming_migf;
>  	struct hisi_acc_vf_migration_file *saving_migf;
>  };
> --
> 2.34.1
>

LGTM. Thanks again for the example.

FWIW:

Reviewed-by: Brett Creeley <brett.creeley@amd.com>
> From: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> Sent: Thursday, February 29, 2024 5:12 PM
>
> The deferred_reset logic was added to vfio migration drivers to prevent
> a circular locking dependency with respect to mm_lock and state mutex.
> This is mainly because of the copy_to/from_user() functions (which take
> mm_lock) invoked under state mutex. But for the HiSilicon driver, the only
> place where we now hold the state mutex for copy_to_user is during the
> PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
> updated the data and perform copy_to_user without state mutex. By this,
> we can get rid of the deferred_reset logic.
>
> Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
On Thu, 29 Feb 2024 14:05:58 -0800
Brett Creeley <bcreeley@amd.com> wrote:

> On 2/29/2024 1:11 AM, Shameer Kolothum wrote:
> > Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
> >
> >
> > The deferred_reset logic was added to vfio migration drivers to prevent
> > a circular locking dependency with respect to mm_lock and state mutex.
> > This is mainly because of the copy_to/from_user() functions (which take
> > mm_lock) invoked under state mutex. But for the HiSilicon driver, the only
> > place where we now hold the state mutex for copy_to_user is during the
> > PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
> > updated the data and perform copy_to_user without state mutex. By this,
> > we can get rid of the deferred_reset logic.
> >
> > Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>
> Shameer,
>
> Thanks for providing this example. After seeing this, it probably
> doesn't make sense to accept my 2/2 patch at
> https://lore.kernel.org/kvm/20240228003205.47311-3-brett.creeley@amd.com/.
>
> I have reworked that patch and am currently doing some testing with it
> to make sure it's functional. Once I have some results I will send a v3.

Darn, somehow this thread snuck by me last week. Currently your series
is at the top of my next branch, so I'll just rebase it to 8512ed256334
("vfio/pds: Always clear the save/restore FDs on reset") to drop your
2/2 and wait for something new relative to the reset logic. Thanks,

Alex
On 3/4/2024 12:27 PM, Alex Williamson wrote:
> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>
>
> On Thu, 29 Feb 2024 14:05:58 -0800
> Brett Creeley <bcreeley@amd.com> wrote:
>
>> On 2/29/2024 1:11 AM, Shameer Kolothum wrote:
>>> Caution: This message originated from an External Source. Use proper caution when opening attachments, clicking links, or responding.
>>>
>>>
>>> The deferred_reset logic was added to vfio migration drivers to prevent
>>> a circular locking dependency with respect to mm_lock and state mutex.
>>> This is mainly because of the copy_to/from_user() functions (which take
>>> mm_lock) invoked under state mutex. But for the HiSilicon driver, the only
>>> place where we now hold the state mutex for copy_to_user is during the
>>> PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
>>> updated the data and perform copy_to_user without state mutex. By this,
>>> we can get rid of the deferred_reset logic.
>>>
>>> Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>
>> Shameer,
>>
>> Thanks for providing this example. After seeing this, it probably
>> doesn't make sense to accept my 2/2 patch at
>> https://lore.kernel.org/kvm/20240228003205.47311-3-brett.creeley@amd.com/.
>>
>> I have reworked that patch and am currently doing some testing with it
>> to make sure it's functional. Once I have some results I will send a v3.
>
> Darn, somehow this thread snuck by me last week. Currently your series
> is at the top of my next branch, so I'll just rebase it to 8512ed256334
> ("vfio/pds: Always clear the save/restore FDs on reset") to drop your
> 2/2 and wait for something new relative to the reset logic. Thanks,
>
> Alex

That works for me. Thanks,

Brett
>
On Thu, Feb 29, 2024 at 09:11:52AM +0000, Shameer Kolothum wrote:
> The deferred_reset logic was added to vfio migration drivers to prevent
> a circular locking dependency with respect to mm_lock and state mutex.
> This is mainly because of the copy_to/from_user() functions (which take
> mm_lock) invoked under state mutex. But for the HiSilicon driver, the only
> place where we now hold the state mutex for copy_to_user is during the
> PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
> updated the data and perform copy_to_user without state mutex. By this,
> we can get rid of the deferred_reset logic.
>
> Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 48 +++++--------------
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h |  6 +--
>  2 files changed, 14 insertions(+), 40 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason
On Thu, 29 Feb 2024 09:11:52 +0000
Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:

> The deferred_reset logic was added to vfio migration drivers to prevent
> a circular locking dependency with respect to mm_lock and state mutex.
> This is mainly because of the copy_to/from_user() functions (which take
> mm_lock) invoked under state mutex. But for the HiSilicon driver, the only
> place where we now hold the state mutex for copy_to_user is during the
> PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
> updated the data and perform copy_to_user without state mutex. By this,
> we can get rid of the deferred_reset logic.
>
> Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 48 +++++--------------
>  .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h |  6 +--
>  2 files changed, 14 insertions(+), 40 deletions(-)

Applied to vfio next branch for v6.9. Thanks,

Alex

>
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> index 4d27465c8f1a..9a3e97108ace 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
> @@ -630,25 +630,11 @@ static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vde
>  	}
>  }
>
> -/*
> - * This function is called in all state_mutex unlock cases to
> - * handle a 'deferred_reset' if exists.
> - */
> -static void
> -hisi_acc_vf_state_mutex_unlock(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> +static void hisi_acc_vf_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>  {
> -again:
> -	spin_lock(&hisi_acc_vdev->reset_lock);
> -	if (hisi_acc_vdev->deferred_reset) {
> -		hisi_acc_vdev->deferred_reset = false;
> -		spin_unlock(&hisi_acc_vdev->reset_lock);
> -		hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
> -		hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
> -		hisi_acc_vf_disable_fds(hisi_acc_vdev);
> -		goto again;
> -	}
> -	mutex_unlock(&hisi_acc_vdev->state_mutex);
> -	spin_unlock(&hisi_acc_vdev->reset_lock);
> +	hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
> +	hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
> +	hisi_acc_vf_disable_fds(hisi_acc_vdev);
>  }
>
>  static void hisi_acc_vf_start_device(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> @@ -804,8 +790,10 @@ static long hisi_acc_vf_precopy_ioctl(struct file *filp,
>
>  	info.dirty_bytes = 0;
>  	info.initial_bytes = migf->total_length - *pos;
> +	mutex_unlock(&migf->lock);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>
> -	ret = copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0;
> +	return copy_to_user((void __user *)arg, &info, minsz) ?
> +			-EFAULT : 0;
>  out:
>  	mutex_unlock(&migf->lock);
>  	mutex_unlock(&hisi_acc_vdev->state_mutex);
> @@ -1071,7 +1059,7 @@ hisi_acc_vfio_pci_set_device_state(struct vfio_device *vdev,
>  			break;
>  		}
>  	}
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  	return res;
>  }
>
> @@ -1092,7 +1080,7 @@ hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
>
>  	mutex_lock(&hisi_acc_vdev->state_mutex);
>  	*curr_state = hisi_acc_vdev->mig_state;
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  	return 0;
>  }
>
> @@ -1104,21 +1092,9 @@ static void hisi_acc_vf_pci_aer_reset_done(struct pci_dev *pdev)
>  	    VFIO_MIGRATION_STOP_COPY)
>  		return;
>
> -	/*
> -	 * As the higher VFIO layers are holding locks across reset and using
> -	 * those same locks with the mm_lock we need to prevent ABBA deadlock
> -	 * with the state_mutex and mm_lock.
> -	 * In case the state_mutex was taken already we defer the cleanup work
> -	 * to the unlock flow of the other running context.
> -	 */
> -	spin_lock(&hisi_acc_vdev->reset_lock);
> -	hisi_acc_vdev->deferred_reset = true;
> -	if (!mutex_trylock(&hisi_acc_vdev->state_mutex)) {
> -		spin_unlock(&hisi_acc_vdev->reset_lock);
> -		return;
> -	}
> -	spin_unlock(&hisi_acc_vdev->reset_lock);
> -	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
> +	mutex_lock(&hisi_acc_vdev->state_mutex);
> +	hisi_acc_vf_reset(hisi_acc_vdev);
> +	mutex_unlock(&hisi_acc_vdev->state_mutex);
>  }
>
>  static int hisi_acc_vf_qm_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> index dcabfeec6ca1..5bab46602fad 100644
> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
> @@ -98,8 +98,8 @@ struct hisi_acc_vf_migration_file {
>
>  struct hisi_acc_vf_core_device {
>  	struct vfio_pci_core_device core_device;
> -	u8 match_done:1;
> -	u8 deferred_reset:1;
> +	u8 match_done;
> +
>  	/* For migration state */
>  	struct mutex state_mutex;
>  	enum vfio_device_mig_state mig_state;
> @@ -109,8 +109,6 @@ struct hisi_acc_vf_core_device {
>  	struct hisi_qm vf_qm;
>  	u32 vf_qm_state;
>  	int vf_id;
> -	/* For reset handler */
> -	spinlock_t reset_lock;
>  	struct hisi_acc_vf_migration_file *resuming_migf;
>  	struct hisi_acc_vf_migration_file *saving_migf;
>  };
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
index 4d27465c8f1a..9a3e97108ace 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
@@ -630,25 +630,11 @@ static void hisi_acc_vf_disable_fds(struct hisi_acc_vf_core_device *hisi_acc_vde
 	}
 }
 
-/*
- * This function is called in all state_mutex unlock cases to
- * handle a 'deferred_reset' if exists.
- */
-static void
-hisi_acc_vf_state_mutex_unlock(struct hisi_acc_vf_core_device *hisi_acc_vdev)
+static void hisi_acc_vf_reset(struct hisi_acc_vf_core_device *hisi_acc_vdev)
 {
-again:
-	spin_lock(&hisi_acc_vdev->reset_lock);
-	if (hisi_acc_vdev->deferred_reset) {
-		hisi_acc_vdev->deferred_reset = false;
-		spin_unlock(&hisi_acc_vdev->reset_lock);
-		hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
-		hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
-		hisi_acc_vf_disable_fds(hisi_acc_vdev);
-		goto again;
-	}
-	mutex_unlock(&hisi_acc_vdev->state_mutex);
-	spin_unlock(&hisi_acc_vdev->reset_lock);
+	hisi_acc_vdev->vf_qm_state = QM_NOT_READY;
+	hisi_acc_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
+	hisi_acc_vf_disable_fds(hisi_acc_vdev);
 }
 
 static void hisi_acc_vf_start_device(struct hisi_acc_vf_core_device *hisi_acc_vdev)
@@ -804,8 +790,10 @@ static long hisi_acc_vf_precopy_ioctl(struct file *filp,
 
 	info.dirty_bytes = 0;
 	info.initial_bytes = migf->total_length - *pos;
+	mutex_unlock(&migf->lock);
+	mutex_unlock(&hisi_acc_vdev->state_mutex);
 
-	ret = copy_to_user((void __user *)arg, &info, minsz) ? -EFAULT : 0;
+	return copy_to_user((void __user *)arg, &info, minsz) ?
+			-EFAULT : 0;
 out:
 	mutex_unlock(&migf->lock);
 	mutex_unlock(&hisi_acc_vdev->state_mutex);
@@ -1071,7 +1059,7 @@ hisi_acc_vfio_pci_set_device_state(struct vfio_device *vdev,
 			break;
 		}
 	}
-	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
+	mutex_unlock(&hisi_acc_vdev->state_mutex);
 	return res;
 }
 
@@ -1092,7 +1080,7 @@ hisi_acc_vfio_pci_get_device_state(struct vfio_device *vdev,
 
 	mutex_lock(&hisi_acc_vdev->state_mutex);
 	*curr_state = hisi_acc_vdev->mig_state;
-	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
+	mutex_unlock(&hisi_acc_vdev->state_mutex);
 	return 0;
 }
 
@@ -1104,21 +1092,9 @@ static void hisi_acc_vf_pci_aer_reset_done(struct pci_dev *pdev)
 	    VFIO_MIGRATION_STOP_COPY)
 		return;
 
-	/*
-	 * As the higher VFIO layers are holding locks across reset and using
-	 * those same locks with the mm_lock we need to prevent ABBA deadlock
-	 * with the state_mutex and mm_lock.
-	 * In case the state_mutex was taken already we defer the cleanup work
-	 * to the unlock flow of the other running context.
-	 */
-	spin_lock(&hisi_acc_vdev->reset_lock);
-	hisi_acc_vdev->deferred_reset = true;
-	if (!mutex_trylock(&hisi_acc_vdev->state_mutex)) {
-		spin_unlock(&hisi_acc_vdev->reset_lock);
-		return;
-	}
-	spin_unlock(&hisi_acc_vdev->reset_lock);
-	hisi_acc_vf_state_mutex_unlock(hisi_acc_vdev);
+	mutex_lock(&hisi_acc_vdev->state_mutex);
+	hisi_acc_vf_reset(hisi_acc_vdev);
+	mutex_unlock(&hisi_acc_vdev->state_mutex);
 }
 
 static int hisi_acc_vf_qm_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
index dcabfeec6ca1..5bab46602fad 100644
--- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
+++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
@@ -98,8 +98,8 @@ struct hisi_acc_vf_migration_file {
 
 struct hisi_acc_vf_core_device {
 	struct vfio_pci_core_device core_device;
-	u8 match_done:1;
-	u8 deferred_reset:1;
+	u8 match_done;
+
 	/* For migration state */
 	struct mutex state_mutex;
 	enum vfio_device_mig_state mig_state;
@@ -109,8 +109,6 @@ struct hisi_acc_vf_core_device {
 	struct hisi_qm vf_qm;
 	u32 vf_qm_state;
 	int vf_id;
-	/* For reset handler */
-	spinlock_t reset_lock;
 	struct hisi_acc_vf_migration_file *resuming_migf;
 	struct hisi_acc_vf_migration_file *saving_migf;
 };
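[Editor's note: the reason the reset handler simplifies can also be sketched in userspace C. This is a hypothetical model, not driver code: `struct vf_dev`, `vf_reset()`, and `aer_reset_done()` stand in for the driver's structures, with `pthread_mutex_t` modeling the kernel `struct mutex` and `0` modeling `QM_NOT_READY`. Once no path calls copy_to_user() under state_mutex, the reset path can simply block on the mutex instead of the old reset_lock/deferred_reset/trylock dance.]

```c
#include <pthread.h>

enum mig_state { STATE_RUNNING, STATE_STOP_COPY };

/* Hypothetical model of hisi_acc_vf_core_device after the patch:
 * note there is no reset_lock and no deferred_reset flag. */
struct vf_dev {
	pthread_mutex_t state_mutex;
	enum mig_state mig_state;
	int vf_qm_state;            /* 0 models QM_NOT_READY */
};

/* Caller must hold state_mutex, mirroring the new hisi_acc_vf_reset(). */
static void vf_reset(struct vf_dev *vdev)
{
	vdev->vf_qm_state = 0;              /* QM_NOT_READY */
	vdev->mig_state = STATE_RUNNING;
	/* hisi_acc_vf_disable_fds() would be called here in the driver */
}

/* Model of hisi_acc_vf_pci_aer_reset_done(): it can now take the
 * state mutex unconditionally, because no other holder of the mutex
 * can be blocked on mm_lock (copy_to_user runs lock-free). */
static void aer_reset_done(struct vf_dev *vdev)
{
	pthread_mutex_lock(&vdev->state_mutex);
	vf_reset(vdev);
	pthread_mutex_unlock(&vdev->state_mutex);
}
```

Compared with the removed code, the reset either waits briefly for the mutex or proceeds immediately; there is no window where a reset can be recorded but not yet applied.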
The deferred_reset logic was added to vfio migration drivers to prevent
a circular locking dependency with respect to mm_lock and state mutex.
This is mainly because of the copy_to/from_user() functions (which take
mm_lock) invoked under state mutex. But for the HiSilicon driver, the only
place where we now hold the state mutex for copy_to_user is during the
PRE_COPY IOCTL. So for pre_copy, release the lock as soon as we have
updated the data and perform copy_to_user without state mutex. By this,
we can get rid of the deferred_reset logic.

Link: https://lore.kernel.org/kvm/20240220132459.GM13330@nvidia.com/
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 48 +++++--------------
 .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h |  6 +--
 2 files changed, 14 insertions(+), 40 deletions(-)