mbox series

[0/5] hyper-v: Don't assume cpu_possible_mask is dense

Message ID 20241003035333.49261-1-mhklinux@outlook.com (mailing list archive)
Headers show
Series hyper-v: Don't assume cpu_possible_mask is dense | expand

Message

Michael Kelley Oct. 3, 2024, 3:53 a.m. UTC
From: Michael Kelley <mhklinux@outlook.com>

Code specific to Hyper-V guests currently assumes the cpu_possible_mask
is "dense" -- i.e., all bit positions 0 thru (nr_cpu_ids - 1) are set,
with no "holes". Therefore, num_possible_cpus() is assumed to be equal
to nr_cpu_ids.

Per a separate discussion[1], this assumption is not valid in the
general case. For example, the function setup_nr_cpu_ids() in
kernel/smp.c is coded to assume cpu_possible_mask may be sparse,
and other patches have been made in the past to correctly handle
the sparseness. See bc75e99983df1efd ("rcu: Correctly handle sparse
possible cpu") as noted by Mark Rutland.

The general case notwithstanding, the configurations that Hyper-V
provides to guest VMs on x86 and ARM64 hardware, in combination
with the algorithms currently used by architecture specific code
to assign Linux CPU numbers, *does* always produce a dense
cpu_possible_mask. So the invalid assumption is not currently
causing failures. But in the interest of correctness, and robustness
against future changes in the code that populates cpu_possible_mask,
update the Hyper-V code to no longer assume denseness.

The typical code pattern with the invalid assumption is as follows:

	array = kcalloc(num_possible_cpus(), sizeof(<some struct>),
			GFP_KERNEL);
	....
	index into "array" with smp_processor_id()

In such as case, the array might be indexed by a value beyond the size
of the array. The correct approach is to allocate the array with size
"nr_cpu_ids". While this will probably leave unused any array entries
corresponding to holes in cpu_possible_mask, the holes are assumed to
be minimal and hence the amount of memory wasted by unused entries is
minimal.

Removing the assumption in Hyper-V code is done in several patches
because they touch different kernel subsystems:

Patch 1: Hyper-V x86 initialization of hv_vp_assist_page (there's no
	 hv_vp_assist_page on ARM64)
Patch 2: Hyper-V common init of hv_vp_index
Patch 3: Hyper-V IOMMU driver
Patch 4: storvsc driver
Patch 5: netvsc driver

I tested the changes by hacking the construction of cpu_possible_mask
to include a hole on x86. With a configuration set to demonstrate the
problem, a Hyper-V guest kernel eventually crashes due to memory
corruption. After the patches in this series, the crash does not occur.

[1] https://lore.kernel.org/lkml/SN6PR02MB4157210CC36B2593F8572E5ED4692@SN6PR02MB4157.namprd02.prod.outlook.com/

Michael Kelley (5):
  x86/hyperv: Don't assume cpu_possible_mask is dense
  Drivers: hv: Don't assume cpu_possible_mask is dense
  iommu/hyper-v: Don't assume cpu_possible_mask is dense
  scsi: storvsc: Don't assume cpu_possible_mask is dense
  hv_netvsc: Don't assume cpu_possible_mask is dense

 arch/x86/hyperv/hv_init.c       |  2 +-
 drivers/hv/hv_common.c          |  4 ++--
 drivers/iommu/hyperv-iommu.c    |  4 ++--
 drivers/net/hyperv/netvsc_drv.c |  2 +-
 drivers/scsi/storvsc_drv.c      | 13 ++++++-------
 5 files changed, 12 insertions(+), 13 deletions(-)

Comments

Peter Zijlstra Oct. 4, 2024, 10:07 a.m. UTC | #1
On Wed, Oct 02, 2024 at 08:53:28PM -0700, mhkelley58@gmail.com wrote:
> 
> Michael Kelley (5):
>   x86/hyperv: Don't assume cpu_possible_mask is dense
>   Drivers: hv: Don't assume cpu_possible_mask is dense
>   iommu/hyper-v: Don't assume cpu_possible_mask is dense
>   scsi: storvsc: Don't assume cpu_possible_mask is dense
>   hv_netvsc: Don't assume cpu_possible_mask is dense

Thanks, these look sane.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
patchwork-bot+netdevbpf@kernel.org Oct. 4, 2024, 11:20 p.m. UTC | #2
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed,  2 Oct 2024 20:53:28 -0700 you wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> Code specific to Hyper-V guests currently assumes the cpu_possible_mask
> is "dense" -- i.e., all bit positions 0 thru (nr_cpu_ids - 1) are set,
> with no "holes". Therefore, num_possible_cpus() is assumed to be equal
> to nr_cpu_ids.
> 
> [...]

Here is the summary with links:
  - [1/5] x86/hyperv: Don't assume cpu_possible_mask is dense
    (no matching commit)
  - [2/5] Drivers: hv: Don't assume cpu_possible_mask is dense
    (no matching commit)
  - [3/5] iommu/hyper-v: Don't assume cpu_possible_mask is dense
    (no matching commit)
  - [4/5] scsi: storvsc: Don't assume cpu_possible_mask is dense
    (no matching commit)
  - [net-next,5/5] hv_netvsc: Don't assume cpu_possible_mask is dense
    https://git.kernel.org/netdev/net-next/c/c86ab60b92d1

You are awesome, thank you!
Jakub Kicinski Oct. 4, 2024, 11:25 p.m. UTC | #3
On Fri, 04 Oct 2024 23:20:40 +0000 patchwork-bot+netdevbpf@kernel.org
wrote:
>   - [net-next,5/5] hv_netvsc: Don't assume cpu_possible_mask is dense
>     https://git.kernel.org/netdev/net-next/c/c86ab60b92d1

On reflection I probably should have asked you for a fixes tag and
applied it to net :S  Oh well.
Michael Kelley Oct. 4, 2024, 11:34 p.m. UTC | #4
From: Jakub Kicinski <kuba@kernel.org> Sent: Friday, October 4, 2024 4:26 PM
> 
> On Fri, 04 Oct 2024 23:20:40 +0000 patchwork-bot+netdevbpf@kernel.org
> wrote:
> >   - [net-next,5/5] hv_netvsc: Don't assume cpu_possible_mask is dense
> >
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=c86ab60b92d1
> 
> On reflection I probably should have asked you for a fixes tag and
> applied it to net :S  Oh well.

I thought about including a Fixes: tag, but decided that since current and
past builds don't exhibit a failure due to this issue, it's not necessary to
backport. The patches are more for robustness against future changes.
I've heard differing views on whether changes that aren't fixing an
actual failure should be backported ....

Also, patchwork bot email says that the *series* was applied to net-next.
But appears that's not the case.  Only the network-related patch from the
series was applied.  Still need the SCSI maintainer to pick up the SCSI patch,
and the Hyper-V maintainer to pick up the others.

Thanks,

Michael
Michael Kelley Dec. 10, 2024, 7:58 p.m. UTC | #5
From: mhkelley58@gmail.com <mhkelley58@gmail.com> Sent: Wednesday, October 2, 2024 8:53 PM
> 
> Code specific to Hyper-V guests currently assumes the cpu_possible_mask
> is "dense" -- i.e., all bit positions 0 thru (nr_cpu_ids - 1) are set,
> with no "holes". Therefore, num_possible_cpus() is assumed to be equal
> to nr_cpu_ids.
> 
> Per a separate discussion[1], this assumption is not valid in the
> general case. For example, the function setup_nr_cpu_ids() in
> kernel/smp.c is coded to assume cpu_possible_mask may be sparse,
> and other patches have been made in the past to correctly handle
> the sparseness. See bc75e99983df1efd ("rcu: Correctly handle sparse
> possible cpu") as noted by Mark Rutland.
> 
> The general case notwithstanding, the configurations that Hyper-V
> provides to guest VMs on x86 and ARM64 hardware, in combination
> with the algorithms currently used by architecture specific code
> to assign Linux CPU numbers, *does* always produce a dense
> cpu_possible_mask. So the invalid assumption is not currently
> causing failures. But in the interest of correctness, and robustness
> against future changes in the code that populates cpu_possible_mask,
> update the Hyper-V code to no longer assume denseness.
> 
> The typical code pattern with the invalid assumption is as follows:
> 
> 	array = kcalloc(num_possible_cpus(), sizeof(<some struct>),
> 			GFP_KERNEL);
> 	....
> 	index into "array" with smp_processor_id()
> 
> In such as case, the array might be indexed by a value beyond the size
> of the array. The correct approach is to allocate the array with size
> "nr_cpu_ids". While this will probably leave unused any array entries
> corresponding to holes in cpu_possible_mask, the holes are assumed to
> be minimal and hence the amount of memory wasted by unused entries is
> minimal.
> 
> Removing the assumption in Hyper-V code is done in several patches
> because they touch different kernel subsystems:
> 
> Patch 1: Hyper-V x86 initialization of hv_vp_assist_page (there's no
> 	 hv_vp_assist_page on ARM64)
> Patch 2: Hyper-V common init of hv_vp_index
> Patch 3: Hyper-V IOMMU driver
> Patch 4: storvsc driver
> Patch 5: netvsc driver

Wei --

Could you pick up Patches 1, 2, and 3 in this series for the hyperv-next
tree? Peter Zijlstra acked the full series [2], and Patches 4 and 5 have
already been picked by the SCSI and net maintainers respectively [3][4].

Let me know if you have any concerns.

Thanks,

Michael

[2] https://lore.kernel.org/linux-hyperv/20241004100742.GO18071@noisy.programming.kicks-ass.net/
[3] https://lore.kernel.org/linux-hyperv/yq15xnsjlc1.fsf@ca-mkp.ca.oracle.com/
[4] https://lore.kernel.org/linux-hyperv/172808404024.2772330.2975585273609596688.git-patchwork-notify@kernel.org/

> 
> I tested the changes by hacking the construction of cpu_possible_mask
> to include a hole on x86. With a configuration set to demonstrate the
> problem, a Hyper-V guest kernel eventually crashes due to memory
> corruption. After the patches in this series, the crash does not occur.
> 
> [1] https://lore.kernel.org/lkml/SN6PR02MB4157210CC36B2593F8572E5ED4692@SN6PR02MB4157.namprd02.prod.outlook.com/
> 
> Michael Kelley (5):
>   x86/hyperv: Don't assume cpu_possible_mask is dense
>   Drivers: hv: Don't assume cpu_possible_mask is dense
>   iommu/hyper-v: Don't assume cpu_possible_mask is dense
>   scsi: storvsc: Don't assume cpu_possible_mask is dense
>   hv_netvsc: Don't assume cpu_possible_mask is dense
> 
>  arch/x86/hyperv/hv_init.c       |  2 +-
>  drivers/hv/hv_common.c          |  4 ++--
>  drivers/iommu/hyperv-iommu.c    |  4 ++--
>  drivers/net/hyperv/netvsc_drv.c |  2 +-
>  drivers/scsi/storvsc_drv.c      | 13 ++++++-------
>  5 files changed, 12 insertions(+), 13 deletions(-)
> 
> --
> 2.25.1
>
Wei Liu Dec. 11, 2024, 12:14 a.m. UTC | #6
On Tue, Dec 10, 2024 at 07:58:34PM +0000, Michael Kelley wrote:
> From: mhkelley58@gmail.com <mhkelley58@gmail.com> Sent: Wednesday, October 2, 2024 8:53 PM
> > 
> > Code specific to Hyper-V guests currently assumes the cpu_possible_mask
> > is "dense" -- i.e., all bit positions 0 thru (nr_cpu_ids - 1) are set,
> > with no "holes". Therefore, num_possible_cpus() is assumed to be equal
> > to nr_cpu_ids.
> > 
> > Per a separate discussion[1], this assumption is not valid in the
> > general case. For example, the function setup_nr_cpu_ids() in
> > kernel/smp.c is coded to assume cpu_possible_mask may be sparse,
> > and other patches have been made in the past to correctly handle
> > the sparseness. See bc75e99983df1efd ("rcu: Correctly handle sparse
> > possible cpu") as noted by Mark Rutland.
> > 
> > The general case notwithstanding, the configurations that Hyper-V
> > provides to guest VMs on x86 and ARM64 hardware, in combination
> > with the algorithms currently used by architecture specific code
> > to assign Linux CPU numbers, *does* always produce a dense
> > cpu_possible_mask. So the invalid assumption is not currently
> > causing failures. But in the interest of correctness, and robustness
> > against future changes in the code that populates cpu_possible_mask,
> > update the Hyper-V code to no longer assume denseness.
> > 
> > The typical code pattern with the invalid assumption is as follows:
> > 
> > 	array = kcalloc(num_possible_cpus(), sizeof(<some struct>),
> > 			GFP_KERNEL);
> > 	....
> > 	index into "array" with smp_processor_id()
> > 
> > In such as case, the array might be indexed by a value beyond the size
> > of the array. The correct approach is to allocate the array with size
> > "nr_cpu_ids". While this will probably leave unused any array entries
> > corresponding to holes in cpu_possible_mask, the holes are assumed to
> > be minimal and hence the amount of memory wasted by unused entries is
> > minimal.
> > 
> > Removing the assumption in Hyper-V code is done in several patches
> > because they touch different kernel subsystems:
> > 
> > Patch 1: Hyper-V x86 initialization of hv_vp_assist_page (there's no
> > 	 hv_vp_assist_page on ARM64)
> > Patch 2: Hyper-V common init of hv_vp_index
> > Patch 3: Hyper-V IOMMU driver
> > Patch 4: storvsc driver
> > Patch 5: netvsc driver
> 
> Wei --
> 
> Could you pick up Patches 1, 2, and 3 in this series for the hyperv-next
> tree? Peter Zijlstra acked the full series [2], and Patches 4 and 5 have
> already been picked by the SCSI and net maintainers respectively [3][4].
> 
> Let me know if you have any concerns.

Michael, I will take a look later after I finish dealing with the
hyperv-fixes branch.

Thanks,
Wei.

> 
> Thanks,
> 
> Michael
> 
> [2] https://lore.kernel.org/linux-hyperv/20241004100742.GO18071@noisy.programming.kicks-ass.net/
> [3] https://lore.kernel.org/linux-hyperv/yq15xnsjlc1.fsf@ca-mkp.ca.oracle.com/
> [4] https://lore.kernel.org/linux-hyperv/172808404024.2772330.2975585273609596688.git-patchwork-notify@kernel.org/
> 
> > 
> > I tested the changes by hacking the construction of cpu_possible_mask
> > to include a hole on x86. With a configuration set to demonstrate the
> > problem, a Hyper-V guest kernel eventually crashes due to memory
> > corruption. After the patches in this series, the crash does not occur.
> > 
> > [1] https://lore.kernel.org/lkml/SN6PR02MB4157210CC36B2593F8572E5ED4692@SN6PR02MB4157.namprd02.prod.outlook.com/
> > 
> > Michael Kelley (5):
> >   x86/hyperv: Don't assume cpu_possible_mask is dense
> >   Drivers: hv: Don't assume cpu_possible_mask is dense
> >   iommu/hyper-v: Don't assume cpu_possible_mask is dense
> >   scsi: storvsc: Don't assume cpu_possible_mask is dense
> >   hv_netvsc: Don't assume cpu_possible_mask is dense
> > 
> >  arch/x86/hyperv/hv_init.c       |  2 +-
> >  drivers/hv/hv_common.c          |  4 ++--
> >  drivers/iommu/hyperv-iommu.c    |  4 ++--
> >  drivers/net/hyperv/netvsc_drv.c |  2 +-
> >  drivers/scsi/storvsc_drv.c      | 13 ++++++-------
> >  5 files changed, 12 insertions(+), 13 deletions(-)
> > 
> > --
> > 2.25.1
> >
Wei Liu Dec. 17, 2024, 7:21 p.m. UTC | #7
On Wed, Dec 11, 2024 at 12:14:50AM +0000, Wei Liu wrote:
> On Tue, Dec 10, 2024 at 07:58:34PM +0000, Michael Kelley wrote:
> > From: mhkelley58@gmail.com <mhkelley58@gmail.com> Sent: Wednesday, October 2, 2024 8:53 PM
> > > 
> > > Code specific to Hyper-V guests currently assumes the cpu_possible_mask
> > > is "dense" -- i.e., all bit positions 0 thru (nr_cpu_ids - 1) are set,
> > > with no "holes". Therefore, num_possible_cpus() is assumed to be equal
> > > to nr_cpu_ids.
> > > 
> > > Per a separate discussion[1], this assumption is not valid in the
> > > general case. For example, the function setup_nr_cpu_ids() in
> > > kernel/smp.c is coded to assume cpu_possible_mask may be sparse,
> > > and other patches have been made in the past to correctly handle
> > > the sparseness. See bc75e99983df1efd ("rcu: Correctly handle sparse
> > > possible cpu") as noted by Mark Rutland.
> > > 
> > > The general case notwithstanding, the configurations that Hyper-V
> > > provides to guest VMs on x86 and ARM64 hardware, in combination
> > > with the algorithms currently used by architecture specific code
> > > to assign Linux CPU numbers, *does* always produce a dense
> > > cpu_possible_mask. So the invalid assumption is not currently
> > > causing failures. But in the interest of correctness, and robustness
> > > against future changes in the code that populates cpu_possible_mask,
> > > update the Hyper-V code to no longer assume denseness.
> > > 
> > > The typical code pattern with the invalid assumption is as follows:
> > > 
> > > 	array = kcalloc(num_possible_cpus(), sizeof(<some struct>),
> > > 			GFP_KERNEL);
> > > 	....
> > > 	index into "array" with smp_processor_id()
> > > 
> > > In such as case, the array might be indexed by a value beyond the size
> > > of the array. The correct approach is to allocate the array with size
> > > "nr_cpu_ids". While this will probably leave unused any array entries
> > > corresponding to holes in cpu_possible_mask, the holes are assumed to
> > > be minimal and hence the amount of memory wasted by unused entries is
> > > minimal.
> > > 
> > > Removing the assumption in Hyper-V code is done in several patches
> > > because they touch different kernel subsystems:
> > > 
> > > Patch 1: Hyper-V x86 initialization of hv_vp_assist_page (there's no
> > > 	 hv_vp_assist_page on ARM64)
> > > Patch 2: Hyper-V common init of hv_vp_index
> > > Patch 3: Hyper-V IOMMU driver
> > > Patch 4: storvsc driver
> > > Patch 5: netvsc driver
> > 
> > Wei --
> > 
> > Could you pick up Patches 1, 2, and 3 in this series for the hyperv-next
> > tree? Peter Zijlstra acked the full series [2], and Patches 4 and 5 have
> > already been picked by the SCSI and net maintainers respectively [3][4].
> > 
> > Let me know if you have any concerns.
> 
> Michael, I will take a look later after I finish dealing with the
> hyperv-fixes branch.

Patch 1 to 3 have been applied to the hyperv-next branch.

Wei.