diff mbox

[3/7] HACK: HobLib: workaround infinite loop

Message ID 20180223132311.26555-4-marcandre.lureau@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Marc-André Lureau Feb. 23, 2018, 1:23 p.m. UTC
From: Marc-André Lureau <marcandre.lureau@redhat.com>

Without this hack, GetNextHob() loops infinitely with the next patch.
I don't understand the reason.

The loop is triggered by the GetFirstGuidHob (&gTpmErrorHobGuid) call.

CC: Laszlo Ersek <lersek@redhat.com>
CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 MdePkg/Library/PeiHobLib/HobLib.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Laszlo Ersek Feb. 23, 2018, 7:14 p.m. UTC | #1
On 02/23/18 14:23, marcandre.lureau@redhat.com wrote:
> From: Marc-André Lureau <marcandre.lureau@redhat.com>
> 
> Without this hack, GetNextHob() loops infinitely with the next patch.
> I don't understand the reason.
> 
> The loop is triggered by the GetFirstGuidHob (&gTpmErrorHobGuid) call.
> 
> CC: Laszlo Ersek <lersek@redhat.com>
> CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  MdePkg/Library/PeiHobLib/HobLib.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/MdePkg/Library/PeiHobLib/HobLib.c b/MdePkg/Library/PeiHobLib/HobLib.c
> index 5c0eeb992f..ed3c5fbd6d 100644
> --- a/MdePkg/Library/PeiHobLib/HobLib.c
> +++ b/MdePkg/Library/PeiHobLib/HobLib.c
> @@ -89,6 +89,10 @@ GetNextHob (
>      if (Hob.Header->HobType == Type) {
>        return Hob.Raw;
>      }
> +    if (GET_HOB_LENGTH (HobStart) == 0) {
> +        DEBUG ((DEBUG_INFO, "FIXME: GetNextHob length == 0"));
> +        return NULL;
> +    }
>      Hob.Raw = GET_NEXT_HOB (Hob);
>    }
>    return NULL;
> 

Strange. The HobLength field is supposed to include the size of the HOB header, so it should never be zero.

Furthermore, the PEI core initializes the HOB list; it should be terminated with an End-of-HOB-List HOB:

PeiCore()                             [MdeModulePkg/Core/Pei/PeiMain/PeiMain.c]
  InitializeMemoryServices()          [MdeModulePkg/Core/Pei/Memory/MemoryServices.c]
    PeiCoreBuildHobHandoffInfoTable() [MdeModulePkg/Core/Pei/Hob/Hob.c]

I tried to reproduce this issue by:
- applying patches 1, 2, and 4
- in function PeimEntryMA(), file "SecurityPkg/Tcg/Tcg2Pei/Tcg2Pei.c", moving the GetFirstGuidHob (&gTpmErrorHobGuid) call to the top of the function.

It didn't hang for me.

Laszlo
Andrew Fish Feb. 23, 2018, 7:45 p.m. UTC | #2
> On Feb 23, 2018, at 5:23 AM, marcandre.lureau@redhat.com wrote:
> 
> From: Marc-André Lureau <marcandre.lureau@redhat.com>
> 
> Without this hack, GetNextHob() loops infinitely with the next patch.
> I don't understand the reason.
> 
> The loop is triggered by the GetFirstGuidHob (&gTpmErrorHobGuid) call.
> 
> CC: Laszlo Ersek <lersek@redhat.com>
> CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
> MdePkg/Library/PeiHobLib/HobLib.c | 4 ++++
> 1 file changed, 4 insertions(+)
> 
> diff --git a/MdePkg/Library/PeiHobLib/HobLib.c b/MdePkg/Library/PeiHobLib/HobLib.c
> index 5c0eeb992f..ed3c5fbd6d 100644
> --- a/MdePkg/Library/PeiHobLib/HobLib.c
> +++ b/MdePkg/Library/PeiHobLib/HobLib.c
> @@ -89,6 +89,10 @@ GetNextHob (
>     if (Hob.Header->HobType == Type) {
>       return Hob.Raw;
>     }
> +    if (GET_HOB_LENGTH (HobStart) == 0) {

As Laszlo points out this error condition is likely memory corruption. Thus it would be better to check for all know illegal values? 

if (GET_HOB_LENGTH(HobStart) < sizeof (EFI_HOB_GENERIC_HEADER)

Thanks,

Andrew Fish

> +        DEBUG ((DEBUG_INFO, "FIXME: GetNextHob length == 0"));
> +        return NULL;
> +    }
>     Hob.Raw = GET_NEXT_HOB (Hob);
>   }
>   return NULL;
> -- 
> 2.16.1.73.g5832b7e9f2
> 
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
Marc-André Lureau March 5, 2018, 2:05 p.m. UTC | #3
Hi

On Fri, Feb 23, 2018 at 8:45 PM, Andrew Fish <afish@apple.com> wrote:
>
>
>> On Feb 23, 2018, at 5:23 AM, marcandre.lureau@redhat.com wrote:
>>
>> From: Marc-André Lureau <marcandre.lureau@redhat.com>
>>
>> Without this hack, GetNextHob() loops infinitely with the next patch.
>> I don't understand the reason.
>>
>> The loop is triggered by the GetFirstGuidHob (&gTpmErrorHobGuid) call.
>>
>> CC: Laszlo Ersek <lersek@redhat.com>
>> CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>> ---
>> MdePkg/Library/PeiHobLib/HobLib.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/MdePkg/Library/PeiHobLib/HobLib.c b/MdePkg/Library/PeiHobLib/HobLib.c
>> index 5c0eeb992f..ed3c5fbd6d 100644
>> --- a/MdePkg/Library/PeiHobLib/HobLib.c
>> +++ b/MdePkg/Library/PeiHobLib/HobLib.c
>> @@ -89,6 +89,10 @@ GetNextHob (
>>     if (Hob.Header->HobType == Type) {
>>       return Hob.Raw;
>>     }
>> +    if (GET_HOB_LENGTH (HobStart) == 0) {
>
> As Laszlo points out this error condition is likely memory corruption. Thus it would be better to check for all know illegal values?
>
> if (GET_HOB_LENGTH(HobStart) < sizeof (EFI_HOB_GENERIC_HEADER)
>

Thanks, I have adjusted the check.

With manual calls and printf (I don't know  a better way to debug ovmf
;), I try to locate the issue. It's somehow related to
RegisterForShadow(). The "corruption" seems to happen during the
second call. After the
PeiLoadImage(...,PEIM_STATE_REGISTER_FOR_SHADOW,..), right before
calling PeimEntryPoint(), a GetFirstGuidHob() succeed, but inside the
function, it fails (with the same arguments). Right after it succeeds
again... The PeimEntryPoint() is not the Tcg2Pei:PeimEntryMA(), I
suppose there is some kind of wrapping code, but I fail to find where.
Any idea?

thanks for your help
Laszlo Ersek March 5, 2018, 6:22 p.m. UTC | #4
On 03/05/18 15:05, Marc-André Lureau wrote:
> Hi
>
> On Fri, Feb 23, 2018 at 8:45 PM, Andrew Fish <afish@apple.com> wrote:
>>
>>
>>> On Feb 23, 2018, at 5:23 AM, marcandre.lureau@redhat.com wrote:
>>>
>>> From: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>
>>> Without this hack, GetNextHob() loops infinitely with the next
>>> patch. I don't understand the reason.
>>>
>>> The loop is triggered by the GetFirstGuidHob (&gTpmErrorHobGuid)
>>> call.
>>>
>>> CC: Laszlo Ersek <lersek@redhat.com>
>>> CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>> ---
>>> MdePkg/Library/PeiHobLib/HobLib.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/MdePkg/Library/PeiHobLib/HobLib.c b/MdePkg/Library/PeiHobLib/HobLib.c
>>> index 5c0eeb992f..ed3c5fbd6d 100644
>>> --- a/MdePkg/Library/PeiHobLib/HobLib.c
>>> +++ b/MdePkg/Library/PeiHobLib/HobLib.c
>>> @@ -89,6 +89,10 @@ GetNextHob (
>>>     if (Hob.Header->HobType == Type) {
>>>       return Hob.Raw;
>>>     }
>>> +    if (GET_HOB_LENGTH (HobStart) == 0) {
>>
>> As Laszlo points out this error condition is likely memory
>> corruption. Thus it would be better to check for all know illegal
>> values?
>>
>> if (GET_HOB_LENGTH(HobStart) < sizeof (EFI_HOB_GENERIC_HEADER)
>>
>
> Thanks, I have adjusted the check.
>
> With manual calls and printf (I don't know  a better way to debug ovmf
> ;),

Well that's how I generally debug it too :)

> I try to locate the issue. It's somehow related to
> RegisterForShadow(). The "corruption" seems to happen during the
> second call. After the
> PeiLoadImage(...,PEIM_STATE_REGISTER_FOR_SHADOW,..), right before
> calling PeimEntryPoint(), a GetFirstGuidHob() succeed, but inside the
> function, it fails (with the same arguments). Right after it succeeds
> again... The PeimEntryPoint() is not the Tcg2Pei:PeimEntryMA(), I
> suppose there is some kind of wrapping code, but I fail to find where.
> Any idea?

This sounds helpful. I don't know what the problem is, but I can
elaborate on your details a bit; perhaps someone else will have more
ideas.

Apparently there is a PEI service called RegisterForShadow().
("Apparently", because I've never seen, let alone written, a PEIM
calling this service.) The service is specified in the PI spec, which is
quoted in the edk2 tree [MdePkg/Include/Pi/PiPeiCis.h]:

> /**
>   Register a PEIM so that it will be shadowed and called again.
>
>   This service registers a file handle so that after memory is
>   available, the PEIM will be re-loaded into permanent memory and
>   re-initialized. The PEIM registered this way will always be
>   initialized twice. The first time, this function call will
>   return EFI_SUCCESS. The second time, this function call will
>   return EFI_ALREADY_STARTED. Depending on the order in which
>   PEIMs are dispatched, the PEIM making this call may be
>   initialized after permanent memory is installed, even the first
>   time.
>
>   @param  FileHandle            PEIM's file handle. Must be the currently
>                                 executing PEIM.
>
>   @retval EFI_SUCCESS           The PEIM was successfully registered for
>                                 shadowing.
>   @retval EFI_ALREADY_STARTED   The PEIM was previously
>                                 registered for shadowing.
>   @retval EFI_NOT_FOUND         The FileHandle does not refer to a
>                                 valid file handle.
>
> **/
> typedef
> EFI_STATUS
> (EFIAPI *EFI_PEI_REGISTER_FOR_SHADOW)(
>   IN  EFI_PEI_FILE_HANDLE FileHandle
>   );

PEIMs generally "execute in place" (XIP), i.e. they run from flash, not
RAM. In this status they use "temporary RAM" (e.g. CPU caches configured
like RAM) for stack & heap; whatever HOBs they produce are stored in
"temp RAM" as well. Then one of the PEIMs "discovers permanent RAM"
(basically it programs the memory controller and publishes the RAM
ranges). In turn the PEI core "migrates" PEIMs from temporary to
permanent RAM, including the HOB list.

Before the temporary RAM migration (when still executing in place from
flash), PEIMs cannot have writeable global variables. For example,
dynamic PCDs are also maintained in a HOB (the PCD HOB).

A PEIM normally cannot (and shouldn't) tell whether it is dispatched
before or after permanent RAM is published. If needed, a PEIM can
advertise that it depends on permanent RAM (for example because it needs
a lot of heap memory) by including "gEfiPeiMemoryDiscoveredPpiGuid" in
its DEPEX.

Finally, it seems like a PEIM can also express, "I'm fine with being
dispatched from both flash (XIP) vs. permanent RAM, just the PEI core
tell me whichever it is". Apparently, if the PEIM is dispatched from
flash (before permanent RAM is available), its call to
RegisterForShadow() returns EFI_SUCCESS (and then its entry point
function will be invoked for a 2nd time, after the temp RAM migration).
And when a PEIM is dispatched from RAM (either for the very first time,
or for the second time, after being dispatched from flash first), the
same call returns EFI_ALREADY_STARTED.

Honestly, I'm unsure what this is good for (both in general, and
specifically for Tcg2Pei). Apparently, Tcg2Pei needs permanent RAM for
doing the measurements (which makes sense); I just wonder what exactly
it does so that it cannot simply specify
"gEfiPeiMemoryDiscoveredPpiGuid" in its DEPEX.

I do see that the (!mImageInMemory) branch contains the TPM
initialization code. However, as the PI spec itself says, it is not
guaranteed that a PEIM will be dispatched from XIP (i.e., before
permanent RAM) *at all*. So it's not clear to me how "business
functionality" can depend on XIP.


Now, OVMF is a bit different wrt. "memory controller programming" -- it
runs on virtual platforms where programming the memory controllers is
unnecessary. What happens is, instead, that only the SEC phase runs from
flash (XIP), and it decompresses even the PEI firmware volume to normal
RAM. The phase where PEIMs "run from flash" (XIP) still exists, of
course, except they actually run from RAM -- so, for example, they have
writeable global variables right off the bat. Perhaps this interferes
somehow with the RegisterForShadow() service and/or how PEIMs expect
that service to work.

Regarding the PEIM entry point -- the PEIMs' "own" entry point functions
are always wrapped.

- The outermost (common) entry point function is called
  _ModuleEntryPoint(). It is declared in
  "MdePkg/Include/Library/PeimEntryPoint.h". This is what the PEI core
  calls when dispatching a PEIM.

- Individual PEIMs add the PeimEntryPoint lib class to their INF files,
  under [LibraryClasses]. The implementation is in
  "MdePkg/Library/PeimEntryPoint". In particular, the function calls
  ProcessLibraryConstructorList().

- The build tools generate the source code for
  ProcessLibraryConstructorList(); based on the (recursively determined)
  library instance dependency tree. (Search the Build subdirectory for
  "AutoGen.c" files.) So, before the PEIM's actual entry point is
  called, the lib instances' CONSTRUCTOR functions are called, in the
  right order, so that the PEIM is entered with all libraries "ready to
  use".

I guess that, when Tcg2Pei is dispatched for the 2nd time, from
permanent RAM, the first (successful) GetFirstGuidHob() call that you
see occurs like this, from generated code, as part of library
construction. Some library instance's constructor function calls
GetFirstGuidHob(), successfully.

The corruption could somehow be related to the HOB list's migration from
temp RAM to permanent RAM. The OVMF debug log should contain something
like this:

> Temp Stack : BaseAddress=0x818000 Length=0x8000
> Temp Heap  : BaseAddress=0x810000 Length=0x8000
> Total temporary memory:    65536 bytes.
>   temporary memory stack ever used:       28656 bytes.
>   temporary memory heap used for HobList: 6056 bytes.
>   temporary memory heap occupied by memory pages: 0 bytes.

Implying that, before the temp RAM migration, the HOB list consumed ~6KB
in total of the 32KB temp RAM heap. It seems unlikely that we run out of
temp RAM heap.

Hopefully this helps you proceed with the debugging... Corrections are
welcome too, obviously!

Thanks,
Laszlo
Andrew Fish March 5, 2018, 8:18 p.m. UTC | #5
> On Mar 5, 2018, at 10:22 AM, Laszlo Ersek <lersek@redhat.com> wrote:
> 
> On 03/05/18 15:05, Marc-André Lureau wrote:
>> Hi
>> 
>> On Fri, Feb 23, 2018 at 8:45 PM, Andrew Fish <afish@apple.com> wrote:
>>> 
>>> 
>>>> On Feb 23, 2018, at 5:23 AM, marcandre.lureau@redhat.com wrote:
>>>> 
>>>> From: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>> 
>>>> Without this hack, GetNextHob() loops infinitely with the next
>>>> patch. I don't understand the reason.
>>>> 
>>>> The loop is triggered by the GetFirstGuidHob (&gTpmErrorHobGuid)
>>>> call.
>>>> 
>>>> CC: Laszlo Ersek <lersek@redhat.com>
>>>> CC: Stefan Berger <stefanb@linux.vnet.ibm.com>
>>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>>> ---
>>>> MdePkg/Library/PeiHobLib/HobLib.c | 4 ++++
>>>> 1 file changed, 4 insertions(+)
>>>> 
>>>> diff --git a/MdePkg/Library/PeiHobLib/HobLib.c b/MdePkg/Library/PeiHobLib/HobLib.c
>>>> index 5c0eeb992f..ed3c5fbd6d 100644
>>>> --- a/MdePkg/Library/PeiHobLib/HobLib.c
>>>> +++ b/MdePkg/Library/PeiHobLib/HobLib.c
>>>> @@ -89,6 +89,10 @@ GetNextHob (
>>>>    if (Hob.Header->HobType == Type) {
>>>>      return Hob.Raw;
>>>>    }
>>>> +    if (GET_HOB_LENGTH (HobStart) == 0) {
>>> 
>>> As Laszlo points out this error condition is likely memory
>>> corruption. Thus it would be better to check for all know illegal
>>> values?
>>> 
>>> if (GET_HOB_LENGTH(HobStart) < sizeof (EFI_HOB_GENERIC_HEADER)
>>> 
>> 
>> Thanks, I have adjusted the check.
>> 
>> With manual calls and printf (I don't know  a better way to debug ovmf
>> ;),
> 
> Well that's how I generally debug it too :)
> 
>> I try to locate the issue. It's somehow related to
>> RegisterForShadow(). The "corruption" seems to happen during the
>> second call. After the
>> PeiLoadImage(...,PEIM_STATE_REGISTER_FOR_SHADOW,..), right before
>> calling PeimEntryPoint(), a GetFirstGuidHob() succeed, but inside the
>> function, it fails (with the same arguments). Right after it succeeds
>> again... The PeimEntryPoint() is not the Tcg2Pei:PeimEntryMA(), I
>> suppose there is some kind of wrapping code, but I fail to find where.
>> Any idea?
> 
> This sounds helpful. I don't know what the problem is, but I can
> elaborate on your details a bit; perhaps someone else will have more
> ideas.
> 
> Apparently there is a PEI service called RegisterForShadow().
> ("Apparently", because I've never seen, let alone written, a PEIM
> calling this service.) The service is specified in the PI spec, which is
> quoted in the edk2 tree [MdePkg/Include/Pi/PiPeiCis.h]:
> 

In the "olden days" the PEI Core did not shadow PEIMs. There were a few PEIMs that had some hacky code to shadow themselves into memory. We thought of making it a library, but there was not a clean way to detect if the PEIM was shadowed. So we ended up adding the PEI Service. You could also use RegisterForShadow with a DEPEX of gEfiPeiMemoryDiscoveredPpiGuid to cause your PEIM to get shadowed even if the PEI Core did not support automatically shadowing PEIMs after memory showed up. 

Anyway the RegisterForShadow can cause the entry point for the PEIM to get called again, and if there is a bug handling that I imagine bad things can happen. 

There are plenty examples of RegisterForShadow usage in the edk2. Maybe looking at how other PEIMs are using it would be helpful. 

(master)>git grep ".RegisterForShadow"
SecurityPkg/Tcg/Tcg2Pei/Tcg2Pei.c:852:    Status = (**PeiServices).RegisterForShadow(FileHandle);
SecurityPkg/Tcg/TcgPei/TcgPei.c:776:    Status = (**PeiServices).RegisterForShadow(FileHandle);
SecurityPkg/Tcg/TrEEPei/TrEEPei.c:616:    Status = (**PeiServices).RegisterForShadow(FileHandle);
(master)>git grep "PeiServicesRegisterForShadow"
EmulatorPkg/Library/SecPeiServicesLib/PeiServicesLib.c:423:PeiServicesRegisterForShadow (
FatPkg/FatPei/FatLiteApi.c:258:  Status = PeiServicesRegisterForShadow (FileHandle);
IntelFrameworkModulePkg/Bus/Isa/IsaFloppyPei/FloppyPeim.c:1725:  Status = PeiServicesRegisterForShadow (FileHandle);
MdeModulePkg/Bus/Pci/EhciPei/EhcPeim.c:1199:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Pci/IdeBusPei/AtapiPeim.c:44:  Status = PeiServicesRegisterForShadow (FileHandle);
MdeModulePkg/Bus/Pci/SdMmcPciHcPei/SdMmcPciHcPei.c:103:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Pci/UfsPciHcPei/UfsPciHcPei.c:92:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Pci/UhciPei/UhcPeim.c:121:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Pci/XhciPei/XhcPeim.c:1462:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Sd/EmmcBlockIoPei/EmmcBlockIoPei.c:693:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Sd/SdBlockIoPei/SdBlockIoPei.c:540:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Ufs/UfsBlockIoPei/UfsBlockIoPei.c:1147:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Usb/UsbBotPei/UsbBotPeim.c:96:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Bus/Usb/UsbBusPei/UsbPeim.c:140:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdeModulePkg/Core/DxeIplPeim/DxeLoad.c:87:    Status = PeiServicesRegisterForShadow (FileHandle);
MdeModulePkg/Universal/Disk/CdExpressPei/PeiCdExpress.c:44:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
MdePkg/Include/Library/PeiServicesLib.h:464:PeiServicesRegisterForShadow (
MdePkg/Library/PeiServicesLib/PeiServicesLib.c:474:PeiServicesRegisterForShadow (
QuarkPlatformPkg/Platform/Pei/PlatformInit/PlatformEarlyInit.c:595:    PeiServicesRegisterForShadow (FileHandle);
QuarkSocPkg/QuarkSouthCluster/Usb/Ohci/Pei/OhcPeim.c:1302:  if (!EFI_ERROR (PeiServicesRegisterForShadow (FileHandle))) {
UefiCpuPkg/CpuIoPei/CpuIoPei.c:898:  Status = PeiServicesRegisterForShadow (FileHandle);


Thanks,

Andrew Fish


>> /**
>>  Register a PEIM so that it will be shadowed and called again.
>> 
>>  This service registers a file handle so that after memory is
>>  available, the PEIM will be re-loaded into permanent memory and
>>  re-initialized. The PEIM registered this way will always be
>>  initialized twice. The first time, this function call will
>>  return EFI_SUCCESS. The second time, this function call will
>>  return EFI_ALREADY_STARTED. Depending on the order in which
>>  PEIMs are dispatched, the PEIM making this call may be
>>  initialized after permanent memory is installed, even the first
>>  time.
>> 
>>  @param  FileHandle            PEIM's file handle. Must be the currently
>>                                executing PEIM.
>> 
>>  @retval EFI_SUCCESS           The PEIM was successfully registered for
>>                                shadowing.
>>  @retval EFI_ALREADY_STARTED   The PEIM was previously
>>                                registered for shadowing.
>>  @retval EFI_NOT_FOUND         The FileHandle does not refer to a
>>                                valid file handle.
>> 
>> **/
>> typedef
>> EFI_STATUS
>> (EFIAPI *EFI_PEI_REGISTER_FOR_SHADOW)(
>>  IN  EFI_PEI_FILE_HANDLE FileHandle
>>  );
> 
> PEIMs generally "execute in place" (XIP), i.e. they run from flash, not
> RAM. In this status they use "temporary RAM" (e.g. CPU caches configured
> like RAM) for stack & heap; whatever HOBs they produce are stored in
> "temp RAM" as well. Then one of the PEIMs "discovers permanent RAM"
> (basically it programs the memory controller and publishes the RAM
> ranges). In turn the PEI core "migrates" PEIMs from temporary to
> permanent RAM, including the HOB list.
> 
> Before the temporary RAM migration (when still executing in place from
> flash), PEIMs cannot have writeable global variables. For example,
> dynamic PCDs are also maintained in a HOB (the PCD HOB).
> 
> A PEIM normally cannot (and shouldn't) tell whether it is dispatched
> before or after permanent RAM is published. If needed, a PEIM can
> advertise that it depends on permanent RAM (for example because it needs
> a lot of heap memory) by including "gEfiPeiMemoryDiscoveredPpiGuid" in
> its DEPEX.
> 
> Finally, it seems like a PEIM can also express, "I'm fine with being
> dispatched from both flash (XIP) vs. permanent RAM, just the PEI core
> tell me whichever it is". Apparently, if the PEIM is dispatched from
> flash (before permanent RAM is available), its call to
> RegisterForShadow() returns EFI_SUCCESS (and then its entry point
> function will be invoked for a 2nd time, after the temp RAM migration).
> And when a PEIM is dispatched from RAM (either for the very first time,
> or for the second time, after being dispatched from flash first), the
> same call returns EFI_ALREADY_STARTED.
> 
> Honestly, I'm unsure what this is good for (both in general, and
> specifically for Tcg2Pei). Apparently, Tcg2Pei needs permanent RAM for
> doing the measurements (which makes sense); I just wonder what exactly
> it does so that it cannot simply specify
> "gEfiPeiMemoryDiscoveredPpiGuid" in its DEPEX.
> 
> I do see that the (!mImageInMemory) branch contains the TPM
> initialization code. However, as the PI spec itself says, it is not
> guaranteed that a PEIM will be dispatched from XIP (i.e., before
> permanent RAM) *at all*. So it's not clear to me how "business
> functionality" can depend on XIP.
> 
> 
> Now, OVMF is a bit different wrt. "memory controller programming" -- it
> runs on virtual platforms where programming the memory controllers is
> unnecessary. What happens is, instead, that only the SEC phase runs from
> flash (XIP), and it decompresses even the PEI firmware volume to normal
> RAM. The phase where PEIMs "run from flash" (XIP) still exists, of
> course, except they actually run from RAM -- so, for example, they have
> writeable global variables right off the bat. Perhaps this interferes
> somehow with the RegisterForShadow() service and/or how PEIMs expect
> that service to work.
> 
> Regarding the PEIM entry point -- the PEIMs' "own" entry point functions
> are always wrapped.
> 
> - The outermost (common) entry point function is called
>  _ModuleEntryPoint(). It is declared in
>  "MdePkg/Include/Library/PeimEntryPoint.h". This is what the PEI core
>  calls when dispatching a PEIM.
> 
> - Individual PEIMs add the PeimEntryPoint lib class to their INF files,
>  under [LibraryClasses]. The implementation is in
>  "MdePkg/Library/PeimEntryPoint". In particular, the function calls
>  ProcessLibraryConstructorList().
> 
> - The build tools generate the source code for
>  ProcessLibraryConstructorList(); based on the (recursively determined)
>  library instance dependency tree. (Search the Build subdirectory for
>  "AutoGen.c" files.) So, before the PEIM's actual entry point is
>  called, the lib instances' CONSTRUCTOR functions are called, in the
>  right order, so that the PEIM is entered with all libraries "ready to
>  use".
> 
> I guess that, when Tcg2Pei is dispatched for the 2nd time, from
> permanent RAM, the first (successful) GetFirstGuidHob() call that you
> see occurs like this, from generated code, as part of library
> construction. Some library instance's constructor function calls
> GetFirstGuidHob(), successfully.
> 
> The corruption could somehow be related to the HOB list's migration from
> temp RAM to permanent RAM. The OVMF debug log should contain something
> like this:
> 
>> Temp Stack : BaseAddress=0x818000 Length=0x8000
>> Temp Heap  : BaseAddress=0x810000 Length=0x8000
>> Total temporary memory:    65536 bytes.
>>  temporary memory stack ever used:       28656 bytes.
>>  temporary memory heap used for HobList: 6056 bytes.
>>  temporary memory heap occupied by memory pages: 0 bytes.
> 
> Implying that, before the temp RAM migration, the HOB list consumed ~6KB
> in total of the 32KB temp RAM heap. It seems unlikely that we run out of
> temp RAM heap.
> 
> Hopefully this helps you proceed with the debugging... Corrections are
> welcome too, obviously!
> 
> Thanks,
> Laszlo
Brian J. Johnson March 6, 2018, 12:45 a.m. UTC | #6
On 03/05/2018 12:22 PM, Laszlo Ersek wrote:
> PEIMs generally "execute in place" (XIP), i.e. they run from flash, not
> RAM. In this status they use "temporary RAM" (e.g. CPU caches configured
> like RAM) for stack & heap; whatever HOBs they produce are stored in
> "temp RAM" as well. Then one of the PEIMs "discovers permanent RAM"
> (basically it programs the memory controller and publishes the RAM
> ranges). In turn the PEI core "migrates" PEIMs from temporary to
> permanent RAM, including the HOB list.
> 
> Before the temporary RAM migration (when still executing in place from
> flash), PEIMs cannot have writeable global variables. For example,
> dynamic PCDs are also maintained in a HOB (the PCD HOB).
> 
> A PEIM normally cannot (and shouldn't) tell whether it is dispatched
> before or after permanent RAM is published. If needed, a PEIM can
> advertise that it depends on permanent RAM (for example because it needs
> a lot of heap memory) by including "gEfiPeiMemoryDiscoveredPpiGuid" in
> its DEPEX.
> 
> Finally, it seems like a PEIM can also express, "I'm fine with being
> dispatched from both flash (XIP) vs. permanent RAM, just the PEI core
> tell me whichever it is". Apparently, if the PEIM is dispatched from
> flash (before permanent RAM is available), its call to
> RegisterForShadow() returns EFI_SUCCESS (and then its entry point
> function will be invoked for a 2nd time, after the temp RAM migration).
> And when a PEIM is dispatched from RAM (either for the very first time,
> or for the second time, after being dispatched from flash first), the
> same call returns EFI_ALREADY_STARTED.
> 
> Honestly, I'm unsure what this is good for (both in general, and
> specifically for Tcg2Pei). Apparently, Tcg2Pei needs permanent RAM for
> doing the measurements (which makes sense); I just wonder what exactly
> it does so that it cannot simply specify
> "gEfiPeiMemoryDiscoveredPpiGuid" in its DEPEX.

I haven't looked at this particular PEIM.  But one case where 
registering for shadowing is useful is for improving performance when 
running from permanent RAM, where writable global variables are 
available.  For instance, when running from flash, a ReportStatusCode 
PEIM may need to go through a slow process to locate an output hardware 
device on every PPI call.  This may involve traversing the HOB list, 
consulting other PPIs, even probing hardware addresses.  But once it's 
shadowed to RAM, it can locate the device once, then cache its address 
in a global.  Not to mention that the code itself is far, far faster 
when run from RAM vs. flash.  (That's probably a key difference between 
a real machine and a VM.)

Also, I've personally written a PEIM which saves a bunch of internal 
state in a HOB, since that's the main writable storage when running from 
flash.  That state includes pointers to other data (in flash.)  Once the 
data is all shadowed to RAM, it updates the HOB to point to the data in 
RAM.  That way it's a lot faster to access the data.

I also have other PEIMs which are constrained (via DEPEX) to run *only* 
from RAM, since they have larger memory requirements than can be 
satisfied from temporary cache-as-RAM.  That's certainly a valid 
technique as well.

RegisterForShadow() is a useful tool for making the most of the 
restricted PEI environment.  And having it standardized like this is, as 
Andrew said, a lot better than the hacks people had to use beforehand.

Thanks,
Gao, Liming March 6, 2018, 2:02 a.m. UTC | #7
Laszlo:
  I also suggest to check the generated ProcessLibraryConstructorList () function. It is in the driver build output AutoGen.c code. You can check what library function be called in this function. Then, further add debug message in the library function. I suspect some function does the wrong operation and corrupt the memory. 

Thanks
Liming
> -----Original Message-----

> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Laszlo Ersek

> Sent: Tuesday, March 6, 2018 2:23 AM

> To: Marc-André Lureau <marcandre.lureau@gmail.com>; Andrew Fish <afish@apple.com>

> Cc: edk2-devel@lists.01.org; Peter Jones <pjones@redhat.com>; Yao, Jiewen <jiewen.yao@intel.com>; QEMU

> <qemu-devel@nongnu.org>; Javier Martinez Canillas <javierm@redhat.com>

> Subject: Re: [edk2] [PATCH 3/7] HACK: HobLib: workaround infinite loop

> 

> On 03/05/18 15:05, Marc-André Lureau wrote:

> > Hi

> >

> > On Fri, Feb 23, 2018 at 8:45 PM, Andrew Fish <afish@apple.com> wrote:

> >>

> >>

> >>> On Feb 23, 2018, at 5:23 AM, marcandre.lureau@redhat.com wrote:

> >>>

> >>> From: Marc-André Lureau <marcandre.lureau@redhat.com>

> >>>

> >>> Without this hack, GetNextHob() loops infinitely with the next

> >>> patch. I don't understand the reason.

> >>>

> >>> The loop is triggered by the GetFirstGuidHob (&gTpmErrorHobGuid)

> >>> call.

> >>>

> >>> CC: Laszlo Ersek <lersek@redhat.com>

> >>> CC: Stefan Berger <stefanb@linux.vnet.ibm.com>

> >>> Contributed-under: TianoCore Contribution Agreement 1.0

> >>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

> >>> ---

> >>> MdePkg/Library/PeiHobLib/HobLib.c | 4 ++++

> >>> 1 file changed, 4 insertions(+)

> >>>

> >>> diff --git a/MdePkg/Library/PeiHobLib/HobLib.c b/MdePkg/Library/PeiHobLib/HobLib.c

> >>> index 5c0eeb992f..ed3c5fbd6d 100644

> >>> --- a/MdePkg/Library/PeiHobLib/HobLib.c

> >>> +++ b/MdePkg/Library/PeiHobLib/HobLib.c

> >>> @@ -89,6 +89,10 @@ GetNextHob (

> >>>     if (Hob.Header->HobType == Type) {

> >>>       return Hob.Raw;

> >>>     }

> >>> +    if (GET_HOB_LENGTH (HobStart) == 0) {

> >>

> >> As Laszlo points out this error condition is likely memory

> >> corruption. Thus it would be better to check for all know illegal

> >> values?

> >>

> >> if (GET_HOB_LENGTH(HobStart) < sizeof (EFI_HOB_GENERIC_HEADER)

> >>

> >

> > Thanks, I have adjusted the check.

> >

> > With manual calls and printf (I don't know  a better way to debug ovmf

> > ;),

> 

> Well that's how I generally debug it too :)

> 

> > I try to locate the issue. It's somehow related to

> > RegisterForShadow(). The "corruption" seems to happen during the

> > second call. After the

> > PeiLoadImage(...,PEIM_STATE_REGISTER_FOR_SHADOW,..), right before

> > calling PeimEntryPoint(), a GetFirstGuidHob() succeed, but inside the

> > function, it fails (with the same arguments). Right after it succeeds

> > again... The PeimEntryPoint() is not the Tcg2Pei:PeimEntryMA(), I

> > suppose there is some kind of wrapping code, but I fail to find where.

> > Any idea?

> 

> This sounds helpful. I don't know what the problem is, but I can

> elaborate on your details a bit; perhaps someone else will have more

> ideas.

> 

> Apparently there is a PEI service called RegisterForShadow().

> ("Apparently", because I've never seen, let alone written, a PEIM

> calling this service.) The service is specified in the PI spec, which is

> quoted in the edk2 tree [MdePkg/Include/Pi/PiPeiCis.h]:

> 

> > /**

> >   Register a PEIM so that it will be shadowed and called again.

> >

> >   This service registers a file handle so that after memory is

> >   available, the PEIM will be re-loaded into permanent memory and

> >   re-initialized. The PEIM registered this way will always be

> >   initialized twice. The first time, this function call will

> >   return EFI_SUCCESS. The second time, this function call will

> >   return EFI_ALREADY_STARTED. Depending on the order in which

> >   PEIMs are dispatched, the PEIM making this call may be

> >   initialized after permanent memory is installed, even the first

> >   time.

> >

> >   @param  FileHandle            PEIM's file handle. Must be the currently

> >                                 executing PEIM.

> >

> >   @retval EFI_SUCCESS           The PEIM was successfully registered for

> >                                 shadowing.

> >   @retval EFI_ALREADY_STARTED   The PEIM was previously

> >                                 registered for shadowing.

> >   @retval EFI_NOT_FOUND         The FileHandle does not refer to a

> >                                 valid file handle.

> >

> > **/

> > typedef

> > EFI_STATUS

> > (EFIAPI *EFI_PEI_REGISTER_FOR_SHADOW)(

> >   IN  EFI_PEI_FILE_HANDLE FileHandle

> >   );

> 

> PEIMs generally "execute in place" (XIP), i.e. they run from flash, not

> RAM. In this status they use "temporary RAM" (e.g. CPU caches configured

> like RAM) for stack & heap; whatever HOBs they produce are stored in

> "temp RAM" as well. Then one of the PEIMs "discovers permanent RAM"

> (basically it programs the memory controller and publishes the RAM

> ranges). In turn the PEI core "migrates" PEIMs from temporary to

> permanent RAM, including the HOB list.

> 

> Before the temporary RAM migration (when still executing in place from

> flash), PEIMs cannot have writeable global variables. For example,

> dynamic PCDs are also maintained in a HOB (the PCD HOB).

> 

> A PEIM normally cannot (and shouldn't) tell whether it is dispatched

> before or after permanent RAM is published. If needed, a PEIM can

> advertise that it depends on permanent RAM (for example because it needs

> a lot of heap memory) by including "gEfiPeiMemoryDiscoveredPpiGuid" in

> its DEPEX.

> 

> Finally, it seems like a PEIM can also express, "I'm fine with being

> dispatched from both flash (XIP) vs. permanent RAM, just the PEI core

> tell me whichever it is". Apparently, if the PEIM is dispatched from

> flash (before permanent RAM is available), its call to

> RegisterForShadow() returns EFI_SUCCESS (and then its entry point

> function will be invoked for a 2nd time, after the temp RAM migration).

> And when a PEIM is dispatched from RAM (either for the very first time,

> or for the second time, after being dispatched from flash first), the

> same call returns EFI_ALREADY_STARTED.

> 

> Honestly, I'm unsure what this is good for (both in general, and

> specifically for Tcg2Pei). Apparently, Tcg2Pei needs permanent RAM for

> doing the measurements (which makes sense); I just wonder what exactly

> it does so that it cannot simply specify

> "gEfiPeiMemoryDiscoveredPpiGuid" in its DEPEX.

> 

> I do see that the (!mImageInMemory) branch contains the TPM

> initialization code. However, as the PI spec itself says, it is not

> guaranteed that a PEIM will be dispatched from XIP (i.e., before

> permanent RAM) *at all*. So it's not clear to me how "business

> functionality" can depend on XIP.

> 

> 

> Now, OVMF is a bit different wrt. "memory controller programming" -- it

> runs on virtual platforms where programming the memory controllers is

> unnecessary. What happens is, instead, that only the SEC phase runs from

> flash (XIP), and it decompresses even the PEI firmware volume to normal

> RAM. The phase where PEIMs "run from flash" (XIP) still exists, of

> course, except they actually run from RAM -- so, for example, they have

> writeable global variables right off the bat. Perhaps this interferes

> somehow with the RegisterForShadow() service and/or how PEIMs expect

> that service to work.

> 

> Regarding the PEIM entry point -- the PEIMs' "own" entry point functions

> are always wrapped.

> 

> - The outermost (common) entry point function is called

>   _ModuleEntryPoint(). It is declared in

>   "MdePkg/Include/Library/PeimEntryPoint.h". This is what the PEI core

>   calls when dispatching a PEIM.

> 

> - Individual PEIMs add the PeimEntryPoint lib class to their INF files,

>   under [LibraryClasses]. The implementation is in

>   "MdePkg/Library/PeimEntryPoint". In particular, the function calls

>   ProcessLibraryConstructorList().

> 

> - The build tools generate the source code for

>   ProcessLibraryConstructorList(); based on the (recursively determined)

>   library instance dependency tree. (Search the Build subdirectory for

>   "AutoGen.c" files.) So, before the PEIM's actual entry point is

>   called, the lib instances' CONSTRUCTOR functions are called, in the

>   right order, so that the PEIM is entered with all libraries "ready to

>   use".

> 

> I guess that, when Tcg2Pei is dispatched for the 2nd time, from

> permanent RAM, the first (successful) GetFirstGuidHob() call that you

> see occurs like this, from generated code, as part of library

> construction. Some library instance's constructor function calls

> GetFirstGuidHob(), successfully.

> 

> The corruption could somehow be related to the HOB list's migration from

> temp RAM to permanent RAM. The OVMF debug log should contain something

> like this:

> 

> > Temp Stack : BaseAddress=0x818000 Length=0x8000

> > Temp Heap  : BaseAddress=0x810000 Length=0x8000

> > Total temporary memory:    65536 bytes.

> >   temporary memory stack ever used:       28656 bytes.

> >   temporary memory heap used for HobList: 6056 bytes.

> >   temporary memory heap occupied by memory pages: 0 bytes.

> 

> Implying that, before the temp RAM migration, the HOB list consumed ~6KB

> in total of the 32KB temp RAM heap. It seems unlikely that we run out of

> temp RAM heap.

> 

> Hopefully this helps you proceed with the debugging... Corrections are

> welcome too, obviously!

> 

> Thanks,

> Laszlo

> _______________________________________________

> edk2-devel mailing list

> edk2-devel@lists.01.org

> https://lists.01.org/mailman/listinfo/edk2-devel
Laszlo Ersek March 6, 2018, 8:38 a.m. UTC | #8
On 03/06/18 01:45, Brian J. Johnson wrote:
> On 03/05/2018 12:22 PM, Laszlo Ersek wrote:
>> PEIMs generally "execute in place" (XIP), i.e. they run from flash, not
>> RAM. In this status they use "temporary RAM" (e.g. CPU caches configured
>> like RAM) for stack & heap; whatever HOBs they produce are stored in
>> "temp RAM" as well. Then one of the PEIMs "discovers permanent RAM"
>> (basically it programs the memory controller and publishes the RAM
>> ranges). In turn the PEI core "migrates" PEIMs from temporary to
>> permanent RAM, including the HOB list.
>>
>> Before the temporary RAM migration (when still executing in place from
>> flash), PEIMs cannot have writeable global variables. For example,
>> dynamic PCDs are also maintained in a HOB (the PCD HOB).
>>
>> A PEIM normally cannot (and shouldn't) tell whether it is dispatched
>> before or after permanent RAM is published. If needed, a PEIM can
>> advertise that it depends on permanent RAM (for example because it needs
>> a lot of heap memory) by including "gEfiPeiMemoryDiscoveredPpiGuid" in
>> its DEPEX.
>>
>> Finally, it seems like a PEIM can also express, "I'm fine with being
>> dispatched from both flash (XIP) vs. permanent RAM, just the PEI core
>> tell me whichever it is". Apparently, if the PEIM is dispatched from
>> flash (before permanent RAM is available), its call to
>> RegisterForShadow() returns EFI_SUCCESS (and then its entry point
>> function will be invoked for a 2nd time, after the temp RAM migration).
>> And when a PEIM is dispatched from RAM (either for the very first time,
>> or for the second time, after being dispatched from flash first), the
>> same call returns EFI_ALREADY_STARTED.
>>
>> Honestly, I'm unsure what this is good for (both in general, and
>> specifically for Tcg2Pei). Apparently, Tcg2Pei needs permanent RAM for
>> doing the measurements (which makes sense); I just wonder what exactly
>> it does so that it cannot simply specify
>> "gEfiPeiMemoryDiscoveredPpiGuid" in its DEPEX.
> 
> I haven't looked at this particular PEIM.  But one case where
> registering for shadowing is useful is for improving performance when
> running from permanent RAM, where writable global variables are
> available.  For instance, when running from flash, a ReportStatusCode
> PEIM may need to go through a slow process to locate an output hardware
> device on every PPI call.  This may involve traversing the HOB list,
> consulting other PPIs, even probing hardware addresses.  But once it's
> shadowed to RAM, it can locate the device once, then cache its address
> in a global.  Not to mention that the code itself is far, far faster
> when run from RAM vs. flash.  (That's probably a key difference between
> a real machine and a VM.)

Ah, this sounds like a great example. Status code reporting is useful /
important for debugging even before permanent RAM is installed, so the
PEIM providing that PPI should not have a depex on
gEfiPeiMemoryDiscoveredPpiGuid. However, once the permanent RAM is
published, it makes sense to speed up the implementation. Thanks for
this example!

> Also, I've personally written a PEIM which saves a bunch of internal
> state in a HOB, since that's the main writable storage when running from
> flash.  That state includes pointers to other data (in flash.)  Once the
> data is all shadowed to RAM, it updates the HOB to point to the data in
> RAM.  That way it's a lot faster to access the data.

Another good example. (I think I've been generally blind to this perf
difference between flash and RAM, specifically concerning execution -- I
must have been spoiled by virt, as you say :) )

> I also have other PEIMs which are constrained (via DEPEX) to run *only*
> from RAM, since they have larger memory requirements than can be
> satisfied from temporary cache-as-RAM.  That's certainly a valid
> technique as well.

Right, that's quite known to OVMF too, due to CpuMpPei being "hungry"
like this.

> RegisterForShadow() is a useful tool for making the most of the
> restricted PEI environment.  And having it standardized like this is, as
> Andrew said, a lot better than the hacks people had to use beforehand.

I agree. I'm happy to have learned about this service!

Thanks all,
Laszlo
diff mbox

Patch

diff --git a/MdePkg/Library/PeiHobLib/HobLib.c b/MdePkg/Library/PeiHobLib/HobLib.c
index 5c0eeb992f..ed3c5fbd6d 100644
--- a/MdePkg/Library/PeiHobLib/HobLib.c
+++ b/MdePkg/Library/PeiHobLib/HobLib.c
@@ -89,6 +89,10 @@  GetNextHob (
     if (Hob.Header->HobType == Type) {
       return Hob.Raw;
     }
+    if (GET_HOB_LENGTH (HobStart) == 0) {
+        DEBUG ((DEBUG_INFO, "FIXME: GetNextHob length == 0"));
+        return NULL;
+    }
     Hob.Raw = GET_NEXT_HOB (Hob);
   }
   return NULL;