[RFC] dom0less vcpu affinity bindings

Message ID alpine.DEB.2.22.394.2502101746240.619090@ubuntu-linux-20-04-desktop (mailing list archive)
State New

Commit Message

Stefano Stabellini Feb. 11, 2025, 1:56 a.m. UTC
Hi all,

We have received requests to introduce Dom0less vCPU affinity bindings
to allow configuring which pCPUs a given vCPU is allowed to run on.

After considering different approaches, I am thinking of using the
following binding format:

    vcpu0 {
           compatible = "xen,vcpu-affinity"; // compatible string
           id = <0>;  // vcpu id
           hard-affinity = "1,4-7"; // pcpu ranges
    };

Notably, the hard-affinity property is represented as a string.

We also considered using a bitmask, such as:

           hard-affinity = <0x0f>;

However, I decided against this approach because, on large server
systems, the number of physical CPUs can be very high, making the
bitmask size potentially very large. The string representation is more
practical for large systems and is also easier to understand and write.
It is also fully aligned with the way we have already implemented the
llc-colors option (see docs/misc/arm/device-tree/booting.txt and
docs/misc/cache-coloring.rst).
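
To make the correspondence between the two representations concrete, here is a minimal standalone sketch (not Xen code) that turns a range string into a bitmask. The name parse_cpu_list, the MAX_CPUS limit and the 64-bit mask are assumptions made for illustration; Xen itself would use cpumask_t. The sketch parses in base 10 and rejects malformed input, which is stricter than the loop in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: a single 64-bit word is enough for this illustration. */
#define MAX_CPUS 64u

/*
 * Hypothetical helper: parse a CPU list such as "1,4-7" into a bitmask.
 * Returns 0 on success, -1 on malformed input or an out-of-range CPU.
 * *mask is only written on success.
 */
static int parse_cpu_list(const char *s, uint64_t *mask)
{
    uint64_t m = 0;

    assert(s != NULL && mask != NULL);

    if ( *s == '\0' )
        return -1;

    for ( ; ; )
    {
        char *endp;
        unsigned long start, end;

        /* Every element must begin with a digit. */
        if ( *s < '0' || *s > '9' )
            return -1;

        errno = 0;
        start = strtoul(s, &endp, 10);
        if ( errno )
            return -1;
        s = endp;
        end = start;                /* Single value unless '-' follows */

        if ( *s == '-' )            /* Range */
        {
            s++;
            if ( *s < '0' || *s > '9' )
                return -1;
            errno = 0;
            end = strtoul(s, &endp, 10);
            if ( errno )
                return -1;
            s = endp;
        }

        /* Reject reversed ranges and CPUs that do not fit in the mask. */
        if ( start > end || end >= MAX_CPUS )
            return -1;

        for ( ; start <= end; start++ )
            m |= UINT64_C(1) << start;

        if ( *s == '\0' )
            break;
        if ( *s != ',' )            /* Trailing garbage */
            return -1;
        s++;
    }

    *mask = m;
    return 0;
}
```

With this helper, "0-3" yields the same value as the <0x0f> bitmask above, and "1,4-7" yields 0xf2. A trailing comma ("1,") or a reversed range ("4-1") is reported as an error instead of being silently accepted.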

What do you think?

Comments

Julien Grall Feb. 12, 2025, 11:09 p.m. UTC | #1
Hi Stefano,

On 11/02/2025 01:56, Stefano Stabellini wrote:
> We have received requests to introduce Dom0less vCPU affinity bindings
> to allow configuring which pCPUs a given vCPU is allowed to run on.
> 
> After considering different approaches, I am thinking of using the
> following binding format:
> 
>      vcpu0 {
>             compatible = "xen,vcpu-affinity"; // compatible string
>             id = <0>;  // vcpu id
>             hard-affinity = "1,4-7"; // pcpu ranges

This would be the CPU logical ID, right? This is a value assigned by Xen 
based on how pCPUs are brought up. So in theory it could change between 
Xen versions as the order is not guaranteed. I know this is what the 
toolstack is currently using.

However, as we define a new binding, I wonder whether it would be better 
to instead have a phandle to the CPU device-tree node or just plain 
MPIDRs? This would guarantee that the vCPU will always land on a given 
pCPU (this could be important when taking into account the cache topology).

Cheers,
Stefano Stabellini Feb. 12, 2025, 11:51 p.m. UTC | #2
On Wed, 12 Feb 2025, Julien Grall wrote:
> Hi Stefano,
> 
> On 11/02/2025 01:56, Stefano Stabellini wrote:
> > We have received requests to introduce Dom0less vCPU affinity bindings
> > to allow configuring which pCPUs a given vCPU is allowed to run on.
> > 
> > After considering different approaches, I am thinking of using the
> > following binding format:
> > 
> >      vcpu0 {
> >             compatible = "xen,vcpu-affinity"; // compatible string
> >             id = <0>;  // vcpu id
> >             hard-affinity = "1,4-7"; // pcpu ranges
> 
> This would be the CPU logical ID, right? This is a value assigned by Xen based
> on how pCPUs are brought up. So in theory it could change between Xen versions
> as the order is not guaranteed. I know this is what the toolstack is currently
> using.
> 
> However, as we define a new binding, I wonder whether it would be better to
> instead have a phandle to the CPU device-tree node or just plain MPIDRs? This
> would guarantee that the vCPU will always land on a given pCPU (this could be
> important when taking into account the cache topology).

Yes I can see that your suggestions would make the configuration more
precise. I was hoping to be able to make the binding arch-neutral so
that it could be used the same way on x86 (and also on RISC-V).

I would prefer to avoid the link to the pCPU device tree node because
hyperlaunch doesn't have the pCPU nodes, and also even for ARM and
RISC-V I think it would be inconvenient to manage the phandle. But maybe
we can find a way to support the MPIDR.

We could add support for the MPIDR either directly in hard-affinity,
because it should be possible to detect whether the input is unsigned
integers or an MPIDR, or with a second arch-specific property, such as:

    vcpu0 {
           compatible = "xen,vcpu-affinity"; // compatible string
           id = <0>;  // vcpu id
           arm-mpidr = <0x80000000>; // MPIDR
    };
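
As a rough standalone sketch (not Xen code) of how such a property could be resolved, the MPIDR can be compared against a per-pCPU table after masking off the non-affinity bits. The function name mpidr_to_logical_cpu, the MPIDR_AFF_MASK macro and the table passed in by the caller are all hypothetical. The mask covers Aff0..Aff2 (bits [23:0]) and Aff3 (bits [39:32]) of MPIDR_EL1, so bit 31, which is set in the 0x80000000 example, is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Affinity fields of MPIDR_EL1: Aff0..Aff2 in bits [23:0], Aff3 in [39:32]. */
#define MPIDR_AFF_MASK UINT64_C(0xff00ffffff)

/*
 * Hypothetical helper: find the logical CPU whose MPIDR matches 'mpidr'.
 * 'map' holds one MPIDR per logical CPU, indexed by logical CPU id.
 * Returns the logical CPU id, or -1 if no pCPU has that MPIDR.
 */
static int mpidr_to_logical_cpu(const uint64_t *map, size_t nr_cpus,
                                uint64_t mpidr)
{
    size_t cpu;

    assert(map != NULL);

    for ( cpu = 0; cpu < nr_cpus; cpu++ )
        if ( (map[cpu] & MPIDR_AFF_MASK) == (mpidr & MPIDR_AFF_MASK) )
            return (int)cpu;

    return -1;
}
```

The resulting logical id could then be set in the affinity mask exactly as for the string form, so the binding would stay stable even if the bring-up order of the pCPUs changes.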

What do you think?

Patch

diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 49d1f14d65..12379ecb20 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -818,6 +818,8 @@  void __init create_domUs(void)
     const struct dt_device_node *cpupool_node,
                                 *chosen = dt_find_node_by_path("/chosen");
     const char *llc_colors_str = NULL;
+    const char *hard_affinity_str = NULL;
+    struct dt_device_node *np;
 
     BUG_ON(chosen == NULL);
     dt_for_each_child_node(chosen, node)
@@ -992,6 +994,55 @@  void __init create_domUs(void)
         if ( rc )
             panic("Could not set up domain %s (rc = %d)\n",
                   dt_node_name(node), rc);
+
+        dt_for_each_child_node(node, np)
+        {
+            const char *s;
+            struct vcpu *v;
+            cpumask_t affinity;
+
+            if ( !dt_device_is_compatible(np, "xen,vcpu-affinity") )
+                continue;
+
+            if ( !dt_property_read_u32(np, "id", &val) )
+                continue;
+            if ( val >= d->max_vcpus )
+                panic("Invalid vcpu_id %u for domain %s\n", val, dt_node_name(node));
+
+            v = d->vcpu[val];
+            rc = dt_property_read_string(np, "hard-affinity", &hard_affinity_str);
+            if ( rc < 0 )
+                continue;
+
+            s = hard_affinity_str;
+            cpumask_clear(&affinity);
+            while ( *s != '\0' )
+            {
+                unsigned int start, end;
+
+                start = simple_strtoul(s, &s, 0);
+
+                if ( *s == '-' )    /* Range */
+                {
+                    s++;
+                    end = simple_strtoul(s, &s, 0);
+                }
+                else                /* Single value */
+                    end = start;
+
+                for ( ; start <= end; start++ )
+                    cpumask_set_cpu(start, &affinity);
+
+                if ( *s == ',' )
+                    s++;
+                else if ( *s != '\0' )
+                    break;
+            }
+
+            rc = vcpu_set_hard_affinity(v, &affinity);
+            if ( rc )
+                panic("vcpu%d: failed to set hard affinity\n", v->vcpu_id);
+        }
     }
 }