[v10,0/7] Basic recovery for machine checks inside SGX

Message ID 20211018202542.584115-1-tony.luck@intel.com (mailing list archive)

Luck, Tony Oct. 18, 2021, 8:25 p.m. UTC
v10 (based on v5.15-rc6)

Changes since v9:

ACPI reviewers (Rafael): No changes to parts 6 & 7.

MM reviewers (Horiguchi-san): No changes to part 5.

Jarkko:
	Added Reviewed-by tags to remaining patches.
	N.B. I kept the tags on parts 1, 3, 4 because the
	changes based on Sean's feedback didn't seem
	consequential. Please let me know if you disagree
	or see new problems that I introduced while trying
	to follow Sean's feedback.

Sean:
	1) Reverse the polarity of the neutron flow (sorry, a
	Dr Who fan will always autocomplete a sentence that
	begins "reverse the polarity" that way). The actual
	change is to the new flag bit: instead of marking
	in-use pages with the new bit, mark free pages. This
	avoids the weirdness where I marked the pages on the
	dirty list as "in-use", when clearly they are not.

	2) Race conditions adding poisoned pages to the global
	list of poisoned pages.
	Fixed this by changing from a global list to a per-node
	list. Additions are protected by the node->lock.

	3) Use list_move() instead of list_del(); list_add().
	Fixed both places where I used this idiom.

	4) Race looking at page->poison when cleaning dirty pages.
	Added a comment documenting why losing this race isn't
	overly harmful.
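
For readers without the patches at hand, points 1-3 above can be
illustrated with a small userspace sketch. This is not the code from the
series: the list helpers are minimal stand-ins for the kernel's
struct list_head API, a pthread mutex stands in for node->lock, and the
names PAGE_FLAG_FREE, struct epc_page, struct epc_node and
node_poison_free_page() are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_del_entry(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void list_add_head(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* One call with the same effect as the "list_del(); list_add()" idiom. */
static void list_move_sketch(struct list_head *e, struct list_head *head)
{
	list_del_entry(e);
	list_add_head(e, head);
}

#define PAGE_FLAG_FREE 0x1u	/* hypothetical: set on free pages, not in-use ones */

struct epc_page {		/* hypothetical, loosely modelled on an EPC page */
	unsigned int flags;
	struct list_head list;
};

struct epc_node {		/* hypothetical per-node state */
	pthread_mutex_t lock;	/* stand-in for node->lock */
	struct list_head free_list;
	struct list_head poison_list;	/* per-node, not global */
};

static void node_init(struct epc_node *n)
{
	pthread_mutex_init(&n->lock, NULL);
	list_init(&n->free_list);
	list_init(&n->poison_list);
}

static void node_add_free(struct epc_node *n, struct epc_page *p)
{
	pthread_mutex_lock(&n->lock);
	p->flags |= PAGE_FLAG_FREE;
	list_add_head(&p->list, &n->free_list);
	pthread_mutex_unlock(&n->lock);
}

/*
 * If the page is currently free, move it to the node's poison list under
 * the node lock. Returns true if the page was moved.
 */
static bool node_poison_free_page(struct epc_node *n, struct epc_page *p)
{
	bool moved = false;

	pthread_mutex_lock(&n->lock);
	if (p->flags & PAGE_FLAG_FREE) {
		p->flags &= ~PAGE_FLAG_FREE;
		list_move_sketch(&p->list, &n->poison_list);
		moved = true;
	}
	pthread_mutex_unlock(&n->lock);
	return moved;
}
```

In this sketch the flag check and the list move both happen under the
same per-node lock, so concurrent additions to one node's poison list
cannot race with each other.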

Tony Luck (7):
  x86/sgx: Add new sgx_epc_page flag bit to mark free pages
  x86/sgx: Add infrastructure to identify SGX EPC pages
  x86/sgx: Initial poison handling for dirty and free pages
  x86/sgx: Add SGX infrastructure to recover from poison
  x86/sgx: Hook arch_memory_failure() into mainline code
  x86/sgx: Add hook to error injection address validation
  x86/sgx: Add check for SGX pages to ghes_do_memory_failure()

 .../firmware-guide/acpi/apei/einj.rst         |  19 +++
 arch/x86/include/asm/processor.h              |   8 ++
 arch/x86/include/asm/set_memory.h             |   4 +
 arch/x86/kernel/cpu/sgx/main.c                | 113 +++++++++++++++++-
 arch/x86/kernel/cpu/sgx/sgx.h                 |   7 +-
 drivers/acpi/apei/einj.c                      |   3 +-
 drivers/acpi/apei/ghes.c                      |   2 +-
 include/linux/mm.h                            |  14 +++
 mm/memory-failure.c                           |  19 ++-
 9 files changed, 179 insertions(+), 10 deletions(-)


base-commit: 519d81956ee277b4419c723adfb154603c2565ba