Message ID | 20240205210638.157741-14-haitao.huang@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add Cgroup support for SGX EPC memory | expand |
On Mon, 2024-02-05 at 13:06 -0800, Haitao Huang wrote: > From: Kristen Carlson Accardi <kristen@linux.intel.com> > > Previous patches have implemented all infrastructure needed for > per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC > pages are still tracked in the global LRU as sgx_lru_list() returns hard > coded reference to the global LRU. > > Change sgx_lru_list() to return the LRU of the cgroup in which the given > EPC page is allocated. > > This makes all EPC pages tracked in per-cgroup LRUs and the global > reclaimer (ksgxd) will not be able to reclaim any pages from the global > LRU. However, in cases of over-committing, i.e., sum of cgroup limits > greater than the total capacity, cgroups may never reclaim but the total > usage can still be near the capacity. Therefore global reclamation is > still needed in those cases and it should reclaim from the root cgroup. > > Modify sgx_reclaim_pages_global(), to reclaim from the root EPC cgroup > when cgroup is enabled, otherwise from the global LRU. > > Similarly, modify sgx_can_reclaim(), to check emptiness of LRUs of all > cgroups when EPC cgroup is enabled, otherwise only check the global LRU. > > With these changes, the global reclamation and per-cgroup reclamation > both work properly with all pages tracked in per-cgroup LRUs. > > Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> > Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com> > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > --- > V7: > - Split this out from the big patch, #10 in V6. (Dave, Kai) > --- > arch/x86/kernel/cpu/sgx/main.c | 16 +++++++++++++++- > 1 file changed, 15 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > index 6b0c26cac621..d4265a390ba9 100644 > --- a/arch/x86/kernel/cpu/sgx/main.c > +++ b/arch/x86/kernel/cpu/sgx/main.c > @@ -34,12 +34,23 @@ static struct sgx_epc_lru_list sgx_global_lru; > > static inline struct sgx_epc_lru_list *sgx_lru_list(struct sgx_epc_page *epc_page) > { > +#ifdef CONFIG_CGROUP_SGX_EPC > + if (epc_page->epc_cg) > + return &epc_page->epc_cg->lru; > + > + /* This should not happen if kernel is configured correctly */ > + WARN_ON_ONCE(1); > +#endif > return &sgx_global_lru; > } How about when EPC cgroup is enabled, but one enclave doesn't belong to any EPC cgroup? Is it OK to track EPC pages for these enclaves to the root EPC cgroup's LRU list together with other enclaves belongs to the root cgroup? This should be a valid case, right?
On Wed, 21 Feb 2024 05:23:00 -0600, Huang, Kai <kai.huang@intel.com> wrote: > On Mon, 2024-02-05 at 13:06 -0800, Haitao Huang wrote: >> From: Kristen Carlson Accardi <kristen@linux.intel.com> >> >> Previous patches have implemented all infrastructure needed for >> per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC >> pages are still tracked in the global LRU as sgx_lru_list() returns hard >> coded reference to the global LRU. >> >> Change sgx_lru_list() to return the LRU of the cgroup in which the given >> EPC page is allocated. >> >> This makes all EPC pages tracked in per-cgroup LRUs and the global >> reclaimer (ksgxd) will not be able to reclaim any pages from the global >> LRU. However, in cases of over-committing, i.e., sum of cgroup limits >> greater than the total capacity, cgroups may never reclaim but the total >> usage can still be near the capacity. Therefore global reclamation is >> still needed in those cases and it should reclaim from the root cgroup. >> >> Modify sgx_reclaim_pages_global(), to reclaim from the root EPC cgroup >> when cgroup is enabled, otherwise from the global LRU. >> >> Similarly, modify sgx_can_reclaim(), to check emptiness of LRUs of all >> cgroups when EPC cgroup is enabled, otherwise only check the global LRU. >> >> With these changes, the global reclamation and per-cgroup reclamation >> both work properly with all pages tracked in per-cgroup LRUs. >> >> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com> >> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> >> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> >> Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com> >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> >> --- >> V7: >> - Split this out from the big patch, #10 in V6. (Dave, Kai) >> --- >> arch/x86/kernel/cpu/sgx/main.c | 16 +++++++++++++++- >> 1 file changed, 15 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/kernel/cpu/sgx/main.c >> b/arch/x86/kernel/cpu/sgx/main.c >> index 6b0c26cac621..d4265a390ba9 100644 >> --- a/arch/x86/kernel/cpu/sgx/main.c >> +++ b/arch/x86/kernel/cpu/sgx/main.c >> @@ -34,12 +34,23 @@ static struct sgx_epc_lru_list sgx_global_lru; >> >> static inline struct sgx_epc_lru_list *sgx_lru_list(struct >> sgx_epc_page *epc_page) >> { >> +#ifdef CONFIG_CGROUP_SGX_EPC >> + if (epc_page->epc_cg) >> + return &epc_page->epc_cg->lru; >> + >> + /* This should not happen if kernel is configured correctly */ >> + WARN_ON_ONCE(1); >> +#endif >> return &sgx_global_lru; >> } > > How about when EPC cgroup is enabled, but one enclave doesn't belong to > any EPC > cgroup? Is it OK to track EPC pages for these enclaves to the root EPC > cgroup's > LRU list together with other enclaves belongs to the root cgroup? > > > This should be a valid case, right? There is no such case. Each page is in the root by default. Thanks Haitao
On 23/02/2024 5:36 am, Haitao Huang wrote: > On Wed, 21 Feb 2024 05:23:00 -0600, Huang, Kai <kai.huang@intel.com> wrote: > >> On Mon, 2024-02-05 at 13:06 -0800, Haitao Huang wrote: >>> From: Kristen Carlson Accardi <kristen@linux.intel.com> >>> >>> Previous patches have implemented all infrastructure needed for >>> per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC >>> pages are still tracked in the global LRU as sgx_lru_list() returns hard >>> coded reference to the global LRU. >>> >>> Change sgx_lru_list() to return the LRU of the cgroup in which the given >>> EPC page is allocated. >>> >>> This makes all EPC pages tracked in per-cgroup LRUs and the global >>> reclaimer (ksgxd) will not be able to reclaim any pages from the global >>> LRU. However, in cases of over-committing, i.e., sum of cgroup limits >>> greater than the total capacity, cgroups may never reclaim but the total >>> usage can still be near the capacity. Therefore global reclamation is >>> still needed in those cases and it should reclaim from the root cgroup. >>> >>> Modify sgx_reclaim_pages_global(), to reclaim from the root EPC cgroup >>> when cgroup is enabled, otherwise from the global LRU. >>> >>> Similarly, modify sgx_can_reclaim(), to check emptiness of LRUs of all >>> cgroups when EPC cgroup is enabled, otherwise only check the global LRU. >>> >>> With these changes, the global reclamation and per-cgroup reclamation >>> both work properly with all pages tracked in per-cgroup LRUs. >>> >>> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com> >>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> >>> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> >>> Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com> >>> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> >>> --- >>> V7: >>> - Split this out from the big patch, #10 in V6. (Dave, Kai) >>> --- >>> arch/x86/kernel/cpu/sgx/main.c | 16 +++++++++++++++- >>> 1 file changed, 15 insertions(+), 1 deletion(-) >>> >>> diff --git a/arch/x86/kernel/cpu/sgx/main.c >>> b/arch/x86/kernel/cpu/sgx/main.c >>> index 6b0c26cac621..d4265a390ba9 100644 >>> --- a/arch/x86/kernel/cpu/sgx/main.c >>> +++ b/arch/x86/kernel/cpu/sgx/main.c >>> @@ -34,12 +34,23 @@ static struct sgx_epc_lru_list sgx_global_lru; >>> >>> static inline struct sgx_epc_lru_list *sgx_lru_list(struct >>> sgx_epc_page *epc_page) >>> { >>> +#ifdef CONFIG_CGROUP_SGX_EPC >>> + if (epc_page->epc_cg) >>> + return &epc_page->epc_cg->lru; >>> + >>> + /* This should not happen if kernel is configured correctly */ >>> + WARN_ON_ONCE(1); >>> +#endif >>> return &sgx_global_lru; >>> } >> >> How about when EPC cgroup is enabled, but one enclave doesn't belong >> to any EPC >> cgroup? Is it OK to track EPC pages for these enclaves to the root >> EPC cgroup's >> LRU list together with other enclaves belongs to the root cgroup? >> >> >> This should be a valid case, right? > > There is no such case. Each page is in the root by default. > Is it guaranteed by the (misc) cgroup design/implementation? If so please add this information to the changelog and/or comments? It helps non-cgroup expert like me to understand.
On Thu, 22 Feb 2024 16:44:45 -0600, Huang, Kai <kai.huang@intel.com> wrote: > > > On 23/02/2024 5:36 am, Haitao Huang wrote: >> On Wed, 21 Feb 2024 05:23:00 -0600, Huang, Kai <kai.huang@intel.com> >> wrote: >> >>> On Mon, 2024-02-05 at 13:06 -0800, Haitao Huang wrote: >>>> From: Kristen Carlson Accardi <kristen@linux.intel.com> >>>> >>>> Previous patches have implemented all infrastructure needed for >>>> per-cgroup EPC page tracking and reclaiming. But all reclaimable EPC >>>> pages are still tracked in the global LRU as sgx_lru_list() returns >>>> hard >>>> coded reference to the global LRU. >>>> >>>> Change sgx_lru_list() to return the LRU of the cgroup in which the >>>> given >>>> EPC page is allocated. >>>> >>>> This makes all EPC pages tracked in per-cgroup LRUs and the global >>>> reclaimer (ksgxd) will not be able to reclaim any pages from the >>>> global >>>> LRU. However, in cases of over-committing, i.e., sum of cgroup limits >>>> greater than the total capacity, cgroups may never reclaim but the >>>> total >>>> usage can still be near the capacity. Therefore global reclamation is >>>> still needed in those cases and it should reclaim from the root >>>> cgroup. >>>> >>>> Modify sgx_reclaim_pages_global(), to reclaim from the root EPC cgroup >>>> when cgroup is enabled, otherwise from the global LRU. >>>> >>>> Similarly, modify sgx_can_reclaim(), to check emptiness of LRUs of all >>>> cgroups when EPC cgroup is enabled, otherwise only check the global >>>> LRU. >>>> >>>> With these changes, the global reclamation and per-cgroup reclamation >>>> both work properly with all pages tracked in per-cgroup LRUs. >>>> >>>> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com> >>>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> >>>> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> >>>> Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com> >>>> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> >>>> --- >>>> V7: >>>> - Split this out from the big patch, #10 in V6. (Dave, Kai) >>>> --- >>>> arch/x86/kernel/cpu/sgx/main.c | 16 +++++++++++++++- >>>> 1 file changed, 15 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/arch/x86/kernel/cpu/sgx/main.c >>>> b/arch/x86/kernel/cpu/sgx/main.c >>>> index 6b0c26cac621..d4265a390ba9 100644 >>>> --- a/arch/x86/kernel/cpu/sgx/main.c >>>> +++ b/arch/x86/kernel/cpu/sgx/main.c >>>> @@ -34,12 +34,23 @@ static struct sgx_epc_lru_list sgx_global_lru; >>>> >>>> static inline struct sgx_epc_lru_list *sgx_lru_list(struct >>>> sgx_epc_page *epc_page) >>>> { >>>> +#ifdef CONFIG_CGROUP_SGX_EPC >>>> + if (epc_page->epc_cg) >>>> + return &epc_page->epc_cg->lru; >>>> + >>>> + /* This should not happen if kernel is configured correctly */ >>>> + WARN_ON_ONCE(1); >>>> +#endif >>>> return &sgx_global_lru; >>>> } >>> >>> How about when EPC cgroup is enabled, but one enclave doesn't belong >>> to any EPC >>> cgroup? Is it OK to track EPC pages for these enclaves to the root >>> EPC cgroup's >>> LRU list together with other enclaves belongs to the root cgroup? >>> >>> >>> This should be a valid case, right? >> There is no such case. Each page is in the root by default. >> > > Is it guaranteed by the (misc) cgroup design/implementation? If so > please add this information to the changelog and/or comments? It helps > non-cgroup expert like me to understand. > Will do Thanks Haitao
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 6b0c26cac621..d4265a390ba9 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -34,12 +34,23 @@ static struct sgx_epc_lru_list sgx_global_lru; static inline struct sgx_epc_lru_list *sgx_lru_list(struct sgx_epc_page *epc_page) { +#ifdef CONFIG_CGROUP_SGX_EPC + if (epc_page->epc_cg) + return &epc_page->epc_cg->lru; + + /* This should not happen if kernel is configured correctly */ + WARN_ON_ONCE(1); +#endif return &sgx_global_lru; } static inline bool sgx_can_reclaim(void) { +#ifdef CONFIG_CGROUP_SGX_EPC + return !sgx_epc_cgroup_lru_empty(misc_cg_root()); +#else return !list_empty(&sgx_global_lru.reclaimable); +#endif } static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); @@ -410,7 +421,10 @@ static void sgx_reclaim_pages_global(bool indirect) { unsigned int nr_to_scan = SGX_NR_TO_SCAN; - sgx_reclaim_pages(&sgx_global_lru, &nr_to_scan, indirect); + if (IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) + sgx_epc_cgroup_reclaim_pages(misc_cg_root(), indirect); + else + sgx_reclaim_pages(&sgx_global_lru, &nr_to_scan, indirect); } /*