Message ID: 1455891343-10016-3-git-send-email-mcgrof@kernel.org (mailing list archive)
State: New, archived
On Fri, Feb 19, 2016 at 6:15 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > Any failure on the x86 init path can be catastrophic. > A simple shift of a call from one place to another can > easily break things. Likewise adding a new call to > one path without considering all x86 requirements > can make certain x86 run time environments crash. > We currently account for these requirements through > peer code review and run time testing. We could do > much better if we had a clean and simple way to annotate > strong semantics for run time requirements, init sequence > dependencies, and detection mechanisms for additions of > new x86 init sequences. Please document this in a way that will be useful for people trying to understand what the code does as opposed to just people who are trying to understand why you wrote it. More below. > +/** > + * struct x86_init_fn - x86 generic kernel init call > + * > + * Linux x86 features vary in complexity, features may require work done at > + * different levels of the full x86 init sequence. Today there are also two > + * different possible entry points for Linux on x86, one for bare metal, KVM > + * and Xen HVM, and another for Xen PV guests / dom0. Assuming a bootloader > + * has set up 64-bit mode, roughly the x86 init sequence follows this path: > + * > + * Bare metal, KVM, Xen HVM Xen PV / dom0 > + * startup_64() startup_xen() > + * \ / > + * x86_64_start_kernel() xen_start_kernel() > + * \ / > + * x86_64_start_reservations() > + * | > + * start_kernel() > + * [ ... ] > + * [ setup_arch() ] > + * [ ... ] > + * init > + * I don't think this paragraph below is necessary. I also think it's very confusing. Keep in mind that the reader has no idea what a "level" is at this point and that the reader also doesn't need to think about terms like "paravirtualization yielding". > + * x86_64_start_kernel() and xen_start_kernel() are the respective first C code > + * entry starting points. 
The different entry points exist to enable Xen to > + * skip a lot of hardware setup already done and managed on behalf of the > + * hypervisor, we refer to this as "paravirtualization yielding". The different > + * levels of init calls on the x86 init sequence exist to account for these > + * slight differences and requirements. These different entry points also share > + * a common entry x86 specific path, x86_64_start_reservations(). > + * And here, I don't even know what a "feature" is. > + * A generic x86 feature can have different initialization calls, one on each > + * of the different main x86 init sequences, but must also address both entry > + * points in order to work properly across the board on all supported x86 > + * subarchitectures. Since x86 features can also have dependencies on other > + * setup code or features, x86 features can at times be subordinate to other > + * x86 features, or conditions. struct x86_init_fn enables feature developers > + * to annotate dependency relationships to ensure subsequent init calls only > + * run once a subordinate's dependencies have run. When needed custom > + * dependency requirements can also be spelled out through a custom dependency > + * checker. In order to account for the dual entry point nature of x86-64 Linux > + * for "paravirtualization yielding" and to make annotations for support for > + * these explicit each struct x86_init_fn must specify supported > + * subarchitectures. The earliest x86-64 code can read the subarchitecture > + * though is after load_idt(), as such the earliest we can currently rely on > + * subarchitecture for semantics and a common init sequences is on the shared > + * common x86_64_start_reservations(). Each struct x86_init_fn must also > + * declare a two-digit decimal number to impose an ordering relative to other > + * features when required. I'm totally lost in the paragraph above. 
> + * > + * x86_init_fn enables strong semantics and dependencies to be defined and > + * implemented on the full x86 initialization sequence. Please try explaining what this is, instead. For example: An x86_init_fn represents a function to be called at a certain point during initialization on a specific set of subarchitectures. > + * > + * @order_level: must be set, linker order level, this corresponds to the table > + * section sub-table index, we record this only for semantic validation > + * purposes. I read this as "this is purely a debugging feature". Is it? I think it's not. > Order-level is always required however you typically would > + * only use X86_INIT_NORMAL*() and leave ordering to be done by placement > + * of code in a C file and the order of objects through a Makefile. Custom > + * order-levels can be used when order on C file and order of objects on > + * Makfiles does not suffice or much further refinements are needed. Assuming I understand this correctly, here's how I'd write it: @order_level: The temporal order of this x86_init_fn. Lower order_level numbers are called first. Ties are broken by order found in the Makefile and then by order in the C file. Note, however, that my proposed explanation can't be right because it appears to conflict with "depend". Please adjust accordingly. > + * @supp_hardware_subarch: must be set, it represents the bitmask of supported > + * subarchitectures. We require each struct x86_init_fn to have this set > + * to require developer considerations for each supported x86 > + * subarchitecture and to build strong annotations of different possible > + * run time states particularly in consideration for the two main > + * different entry points for x86 Linux, to account for paravirtualization > + * yielding. ' Too much motivation, too little documentation. @supp_hardware_subarch: A bitmask of subarchitectures on which to call this init function. 
--- start big deletion --- > + * > + * The subarchitecture is read by the kernel at early boot from the > + * struct boot_params hardware_subarch. Support for the subarchitecture > + * exists as of x86 boot protocol 2.07. The bootloader would have set up > + * the respective hardware_subarch on the boot sector as per > + * Documentation/x86/boot.txt. > + * > + * What x86 entry point is used is determined at run time by the > + * bootloader. Linux pv_ops was designed to help enable to build one Linux > + * binary to support bare metal and different hypervisors. pv_ops setup > + * code however is limited in that all pv_ops setup code is run late in > + * the x86 init sequence, during setup_arch(). In fact cpu_has_hypervisor > + * only works after early_cpu_init() during setup_arch(). If an x86 > + * feature requires an earlier determination of what hypervisor was used, > + * or if it needs to annotate only support for certain hypervisors, the > + * x86 hardware_subarch should be set by the bootloader and > + * @supp_hardware_subarch set by the x86 feature. Using hardware_subarch > + * enables x86 features to fill the semantic gap between the Linux x86 > + * entry point used and what pv_ops has to offer through a hypervisor > + * agnostic mechanism. --- end big deletion --- > + * > + * Each supported subarchitecture is set using the respective > + * X86_SUBARCH_* as a bit in the bitmask. For instance if a feature > + * is supported on PC and Xen subarchitectures only you would set this > + * bitmask to: > + * > + * BIT(X86_SUBARCH_PC) | > + * BIT(X86_SUBARCH_XEN); I like this part, but how about "For instance, if an init function should be called on PC and Xen subarchitectures only, you would set the bitmask to..."? > + * > + * @detect: optional, if set returns true if the feature has been detected to > + * be required, it returns false if the feature has been detected to not > + * be required. I have absolutely no idea what this means. 
> + * @depend: optional, if set this set of init routines must be called prior to
> + * the init routine who's respective detect routine we have set this
> + * depends callback to. This is only used for sorting purposes given
> + * all current init callbacks have a void return type. Sorting is
> + * implemented via x86_init_fn_sort(), it must be called only once,
> + * however you can delay sorting until you need it if you can ensure
> + * only @order_level and @supp_hardware_subarch can account for proper
> + * ordering and dependency requirements for all init sequences prior.
> + * If you do not have a depend callback set its assumed the order level
> + * (__x86_init_fn(level)) set by the init routine suffices to set the
> + * order for when the feature's respective callbacks are called with
> + * respect to other calls. Sorting of init calls with the same order level
> + * is determined by linker order, determined by order placement on C code
> + * and order listed on a Makefile. A routine that depends on another is
> + * known as being subordinate to the init routine it depends on. Routines
> + * that are subordinate must have an order-level of lower priority or
> + * equal priority than the order-level of the init sequence it depends on.

I don't understand this at all. I assume you're saying that some kind
of topological sorting happens. This leads to a question: why is
"depend" a function pointer? Shouldn't it be x86_init_fn*?

Also, what happens if you depend on something that is disabled on the
running subarch? And what happens if the depend-implied order is
inconsistent with order_level?

I would hope that order_level breaks ties in the topological sort and
that there's a big fat warning if the order_level ordering is
inconsistent with the topological ordering. Preferably the warning
would be at build time, in which case it could be an error.
> + * @early_init: required, routine which will run in x86_64_start_reservations()
> + * after we ensure boot_params.hdr.hardware_subarch is accessible and
> + * properly set. Memory is not yet available. This the earliest we can
> + * currently define a common shared callback since all callbacks need to
> + * check for boot_params.hdr.hardware_subarch and this becomes accessible
> + * on x86-64 until after load_idt().

What's this for? Under what conditions is it called? What order?
Why is it part of x86_init_fn at all?

> + * @flags: optional, bitmask of enum x86_init_fn_flags

What are these flags? What do they do?

> + */
> +struct x86_init_fn {
> +	__u32 order_level;
> +	__u32 supp_hardware_subarch;
> +	bool (*detect)(void);
> +	bool (*depend)(void);
> +	void (*early_init)(void);
> +	__u32 flags;
> +};
> +
> +/**
> + * enum x86_init_fn_flags: flags for init sequences
> + *
> + * X86_INIT_FINISH_IF_DETECTED: tells the core that once this init sequence
> + *	has completed it can break out of the loop for init sequences on
> + *	its own level.

What does this mean?

> + * X86_INIT_DETECTED: private flag. Used by the x86 core to annotate that this
> + *	init sequence has been detected and it all of its callbacks
> + *	must be run during initialization.

Please make this an entry in a new field scratch_space or similar and
just document the entire field as "private to the init core code".

> +static struct x86_init_fn *x86_init_fn_find_dep(struct x86_init_fn *start,
> +						struct x86_init_fn *finish,
> +						struct x86_init_fn *q)
> +{
> +	struct x86_init_fn *p;
> +
> +	if (!q)
> +		return NULL;
> +
> +	for (p = start; p < finish; p++)
> +		if (p->detect == q->depend)
> +			return p;

That's very strange indeed, and it doesn't seem consistent with my
explanation. Please fix up the docs to explain what's going on.
Again, as a reviewer and eventual user of this code, I do not need to
know what historical problem it solves. I need to know what the code
does and how to use it.
> +
> +void __ref x86_init_fn_init_tables(void)

Why is this __ref and not __init?

--Andy
On Fri, Feb 19, 2016 at 08:40:49AM -0800, Andy Lutomirski wrote: > On Fri, Feb 19, 2016 at 6:15 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > > Any failure on the x86 init path can be catastrophic. > > A simple shift of a call from one place to another can > > easily break things. Likewise adding a new call to > > one path without considering all x86 requirements > > can make certain x86 run time environments crash. > > We currently account for these requirements through > > peer code review and run time testing. We could do > > much better if we had a clean and simple way to annotate > > strong semantics for run time requirements, init sequence > > dependencies, and detection mechanisms for additions of > > new x86 init sequences. > > Please document this in a way that will be useful for people trying to > understand what the code does as opposed to just people who are trying > to understand why you wrote it. More below. Sure. I'm kind of glad people are getting tired of me trying to explain *why* I wrote it though, I figured that might be the harder thing to explain. Hopefully it sinks in. > > +/** > > + * struct x86_init_fn - x86 generic kernel init call > > + * > > + * Linux x86 features vary in complexity, features may require work done at > > + * different levels of the full x86 init sequence. Today there are also two > > + * different possible entry points for Linux on x86, one for bare metal, KVM > > + * and Xen HVM, and another for Xen PV guests / dom0. Assuming a bootloader > > + * has set up 64-bit mode, roughly the x86 init sequence follows this path: > > + * > > + * Bare metal, KVM, Xen HVM Xen PV / dom0 > > + * startup_64() startup_xen() > > + * \ / > > + * x86_64_start_kernel() xen_start_kernel() > > + * \ / > > + * x86_64_start_reservations() > > + * | > > + * start_kernel() > > + * [ ... ] > > + * [ setup_arch() ] > > + * [ ... ] > > + * init > > + * > > > I don't think this paragraph below is necessary. I also think it's > very confusing. 
Keep in mind that the reader has no idea what a > "level" is at this point and that the reader also doesn't need to > think about terms like "paravirtualization yielding". > > > + * x86_64_start_kernel() and xen_start_kernel() are the respective first C code > > + * entry starting points. The different entry points exist to enable Xen to > > + * skip a lot of hardware setup already done and managed on behalf of the > > + * hypervisor, we refer to this as "paravirtualization yielding". The different > > + * levels of init calls on the x86 init sequence exist to account for these > > + * slight differences and requirements. These different entry points also share > > + * a common entry x86 specific path, x86_64_start_reservations(). OK sure, I'll nuke. > > + * > > And here, I don't even know what a "feature" is. Kasan is an example. > > + * A generic x86 feature can have different initialization calls, one on each > > + * of the different main x86 init sequences, but must also address both entry > > + * points in order to work properly across the board on all supported x86 > > + * subarchitectures. Since x86 features can also have dependencies on other > > + * setup code or features, x86 features can at times be subordinate to other > > + * x86 features, or conditions. struct x86_init_fn enables feature developers > > + * to annotate dependency relationships to ensure subsequent init calls only > > + * run once a subordinate's dependencies have run. When needed custom > > + * dependency requirements can also be spelled out through a custom dependency > > + * checker. In order to account for the dual entry point nature of x86-64 Linux > > + * for "paravirtualization yielding" and to make annotations for support for > > + * these explicit each struct x86_init_fn must specify supported > > + * subarchitectures. 
The earliest x86-64 code can read the subarchitecture
> > + * though is after load_idt(), as such the earliest we can currently rely on
> > + * subarchitecture for semantics and a common init sequences is on the shared
> > + * common x86_64_start_reservations(). Each struct x86_init_fn must also
> > + * declare a two-digit decimal number to impose an ordering relative to other
> > + * features when required.
>
> I'm totally lost in the paragraph above.

As an example Kasan could be defined as a struct x86_init_fn with a
respective callback on x86_64_start_kernel() (kasan_early_init()) and
later setup_arch() (kasan_init()). Since we can't support struct
x86_init_fn calls starting all the way from x86_64_start_kernel()
(given the issues I had noted before), the earliest a struct
x86_init_fn can be used for is a callback on
x86_64_start_reservations().

It just means we can have different callbacks used for different parts
of the x86 init process; it explains the limits to the callbacks we
can have right now.

> > + *
> > + * x86_init_fn enables strong semantics and dependencies to be defined and
> > + * implemented on the full x86 initialization sequence.
>
> Please try explaining what this is, instead. For example:
>
> An x86_init_fn represents a function to be called at a certain point
> during initialization on a specific set of subarchitectures.

OK thanks.

> > + *
> > + * @order_level: must be set, linker order level, this corresponds to the table
> > + * section sub-table index, we record this only for semantic validation
> > + * purposes.
>
> I read this as "this is purely a debugging feature". Is it? I think it's not.

Nope, the order_level really is used for linker table sorting, but
since we can modify the table for these structs with a run time sort
later, we want to ensure that the order level is still respected after
we sort at run time.
> > Order-level is always required however you typically would > > + * only use X86_INIT_NORMAL*() and leave ordering to be done by placement > > + * of code in a C file and the order of objects through a Makefile. Custom > > + * order-levels can be used when order on C file and order of objects on > > + * Makfiles does not suffice or much further refinements are needed. > > > Assuming I understand this correctly, here's how I'd write it: > > @order_level: The temporal order of this x86_init_fn. Lower > order_level numbers are called first. Ties are broken by order found > in the Makefile and then by order in the C file. Thanks... yes. > > Note, however, that my proposed explanation can't be right because it > appears to conflict with "depend". Please adjust accordingly. > > > + * @supp_hardware_subarch: must be set, it represents the bitmask of supported > > + * subarchitectures. We require each struct x86_init_fn to have this set > > + * to require developer considerations for each supported x86 > > + * subarchitecture and to build strong annotations of different possible > > + * run time states particularly in consideration for the two main > > + * different entry points for x86 Linux, to account for paravirtualization > > + * yielding. > ' > Too much motivation, too little documentation. > > @supp_hardware_subarch: A bitmask of subarchitectures on which to call > this init function. OK thanks. > --- start big deletion --- > > > + * > > + * The subarchitecture is read by the kernel at early boot from the > > + * struct boot_params hardware_subarch. Support for the subarchitecture > > + * exists as of x86 boot protocol 2.07. The bootloader would have set up > > + * the respective hardware_subarch on the boot sector as per > > + * Documentation/x86/boot.txt. > > + * > > + * What x86 entry point is used is determined at run time by the > > + * bootloader. 
Linux pv_ops was designed to help enable to build one Linux
> > + * binary to support bare metal and different hypervisors. pv_ops setup
> > + * code however is limited in that all pv_ops setup code is run late in
> > + * the x86 init sequence, during setup_arch(). In fact cpu_has_hypervisor
> > + * only works after early_cpu_init() during setup_arch(). If an x86
> > + * feature requires an earlier determination of what hypervisor was used,
> > + * or if it needs to annotate only support for certain hypervisors, the
> > + * x86 hardware_subarch should be set by the bootloader and
> > + * @supp_hardware_subarch set by the x86 feature. Using hardware_subarch
> > + * enables x86 features to fill the semantic gap between the Linux x86
> > + * entry point used and what pv_ops has to offer through a hypervisor
> > + * agnostic mechanism.
>
> --- end big deletion ---

OK!

> > + *
> > + * Each supported subarchitecture is set using the respective
> > + * X86_SUBARCH_* as a bit in the bitmask. For instance if a feature
> > + * is supported on PC and Xen subarchitectures only you would set this
> > + * bitmask to:
> > + *
> > + *	BIT(X86_SUBARCH_PC) |
> > + *	BIT(X86_SUBARCH_XEN);
>
> I like this part, but how about "For instance, if an init function
> should be called on PC and Xen subarchitectures only, you would set
> the bitmask to..."?

Sure.

> > + *
> > + * @detect: optional, if set returns true if the feature has been detected to
> > + * be required, it returns false if the feature has been detected to not
> > + * be required.
>
> I have absolutely no idea what this means.

Well, the order level stuff is used at link time; there are, however,
some run time environment conditions that might be important to
consider as well. The detect callback lets you write a C routine which
will check to see if the feature should run; it returns false if the
feature should not run or is not required to run.
A cheesy way to implement a cheesy graph is to then use the same callback as a pointer for dependency, so that if a feature depends on another, both are on the same order level, and if they have a run time dependency annotation, they can further annotate this relationship by having the thing that depends, set its depends pointer to refer to the other feature's detect callback. This is borrowed from the IOMMU init code in the kernel. Same logic. Only thing is we also get link order support too. Link order semantics and logic should suffice for most things, but not all as with the IOMMU init stuff. > > + * @depend: optional, if set this set of init routines must be called prior to > > + * the init routine who's respective detect routine we have set this > > + * depends callback to. This is only used for sorting purposes given > > + * all current init callbacks have a void return type. Sorting is > > + * implemented via x86_init_fn_sort(), it must be called only once, > > + * however you can delay sorting until you need it if you can ensure > > + * only @order_level and @supp_hardware_subarch can account for proper > > + * ordering and dependency requirements for all init sequences prior. > > + * If you do not have a depend callback set its assumed the order level > > + * (__x86_init_fn(level)) set by the init routine suffices to set the > > + * order for when the feature's respective callbacks are called with > > + * respect to other calls. Sorting of init calls with the same order level > > + * is determined by linker order, determined by order placement on C code > > + * and order listed on a Makefile. A routine that depends on another is > > + * known as being subordinate to the init routine it depends on. Routines > > + * that are subordinate must have an order-level of lower priority or > > + * equal priority than the order-level of the init sequence it depends on. > > I don't understand this at all. 
I assume you're saying that some kind
> of topological sorting happens. This leads to a question: why is
> "depend" a function pointer? Shouldn't it be x86_init_fn*?

That's fine too. It was just borrowing the implementation from the
iommu solution. I've modified that considerably so could not just
extend it. After some more thought I also considered generalizing it
but decided against it given that run time sorting can depend on very
custom semantics and most importantly data structure size.

> Also,
> what happens if you depend on something that is disabled on the
> running subarch?

We have a flag that annotates if a callback runs; if the flag is not
set then a depends thing will ensure we do not call the routine that
has depends set.

> And what happens if the depend-implied order is
> inconsistent with order_level?

Ah -- that's what I was trying to explain above in the comment about
inconsistent state -- a run time check *after* we do run time sorting
goes through and checks for sanity. All this is a lightweight feature
graph, if you will; other folks are working on different solutions for
this and while I have considered sharing a solution for early init
code this seems a bit more ideal for now. Later in the future though
we should probably get together and talk about a proper feature graph,
with more consistent checks, etc. It's another reason why I am also
curious to see a SAT solver at run time in the future, but this is
super long term [0]...

[0] http://kernelnewbies.org/KernelProjects/kconfig-sat

> I would hope that order_level breaks ties in the topological sort and
> that there's a big fat warning if the order_level ordering is
> inconsistent with the topological ordering. Preferably the warning
> would be at build time, in which case it could be an error.

Order level will always be respected at link time. The depend and
detect things are heuristics used at run time, and since it depends on
run time state we can't do this at build time.
The big fat hairy nasty warning is there though. Curiously enough I
ended up finding out that the sanity check for IOMMU was intended to
only be run at debug time but the debug thing was *always* set to
true, so we've been running the sanity checker in the kernel for ages.
It's why I also feel a bit more comfortable in keeping it live and
matching more closely to its implementation. I've gone to pains to
build off of the IOMMU init solution and simply extend the semantics
there. Turns out there was much room for enhancements. All the fine
tuning is documented in the userspace tree.

> > + * @early_init: required, routine which will run in x86_64_start_reservations()
> > + * after we ensure boot_params.hdr.hardware_subarch is accessible and
> > + * properly set. Memory is not yet available. This the earliest we can
> > + * currently define a common shared callback since all callbacks need to
> > + * check for boot_params.hdr.hardware_subarch and this becomes accessible
> > + * on x86-64 until after load_idt().
>
> What's this for? Under what conditions is it called? What order?
> Why is it part of x86_init_fn at all?

It's one of the reasons we are adding this: callbacks for features.
It's called if its requirements are met (subarch, and depends). I'll
work on extending this to be clearer.

> > + * @flags: optional, bitmask of enum x86_init_fn_flags
>
> What are these flags? What do they do?

They tell us the supported subarchs that must be matched at run time
in order for the feature's callback to run.

> > + */
> > +struct x86_init_fn {
> > +	__u32 order_level;
> > +	__u32 supp_hardware_subarch;
> > +	bool (*detect)(void);
> > +	bool (*depend)(void);
> > +	void (*early_init)(void);
> > +	__u32 flags;
> > +};
> > +
> > +/**
> > + * enum x86_init_fn_flags: flags for init sequences
> > + *
> > + * X86_INIT_FINISH_IF_DETECTED: tells the core that once this init sequence
> > + *	has completed it can break out of the loop for init sequences on
> > + *	its own level.
>
> What does this mean?

It borrows the logic from the IOMMU stuff: it enables the core to
detect when *any* routine in the linker table has completed, so we can
then just bail and stop going through the rest of the linker table.

> > + * X86_INIT_DETECTED: private flag. Used by the x86 core to annotate that this
> > + *	init sequence has been detected and it all of its callbacks
> > + *	must be run during initialization.
>
> Please make this an entry in a new field scratch_space or similar and
> just document the entire field as "private to the init core code".

OK.

> > +static struct x86_init_fn *x86_init_fn_find_dep(struct x86_init_fn *start,
> > +						struct x86_init_fn *finish,
> > +						struct x86_init_fn *q)
> > +{
> > +	struct x86_init_fn *p;
> > +
> > +	if (!q)
> > +		return NULL;
> > +
> > +	for (p = start; p < finish; p++)
> > +		if (p->detect == q->depend)
> > +			return p;
>
> That's very strange indeed, and it doesn't seem consistent with my
> explanation. Please fix up the docs to explain what's going on.
> Again, as a reviewer and eventual user of this code, I do not need to
> know what historical problem it solves. I need to know what the code
> does and how to use it.

Understood. To help, please note that this borrows from the IOMMU
stuff, which has even less documentation. I was trying to actually put
some good effort into fixing that for our sake and to enhance the
semantics further. If things seem confusing just keep in mind this is
the IOMMU init solution with enhanced semantics and linker table
support.

> > +
> > +void __ref x86_init_fn_init_tables(void)
>
> Why is this __ref and not __init?

Yeah __init should work better, thanks.

  Luis
On Fri, Feb 19, 2016 at 4:42 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > On Fri, Feb 19, 2016 at 08:40:49AM -0800, Andy Lutomirski wrote: >> On Fri, Feb 19, 2016 at 6:15 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: >> > Any failure on the x86 init path can be catastrophic. >> > A simple shift of a call from one place to another can >> > easily break things. Likewise adding a new call to >> > one path without considering all x86 requirements >> > can make certain x86 run time environments crash. >> > We currently account for these requirements through >> > peer code review and run time testing. We could do >> > much better if we had a clean and simple way to annotate >> > strong semantics for run time requirements, init sequence >> > dependencies, and detection mechanisms for additions of >> > new x86 init sequences. >> >> Please document this in a way that will be useful for people trying to >> understand what the code does as opposed to just people who are trying >> to understand why you wrote it. More below. > > Sure. I'm kind of glad people are getting tired of me trying to > explain *why* I wrote it though, I figured that might be the harder > thing to explain. Hopefully it sinks in. > >> > +/** >> > + * struct x86_init_fn - x86 generic kernel init call >> > + * >> > + * Linux x86 features vary in complexity, features may require work done at >> > + * different levels of the full x86 init sequence. Today there are also two >> > + * different possible entry points for Linux on x86, one for bare metal, KVM >> > + * and Xen HVM, and another for Xen PV guests / dom0. Assuming a bootloader >> > + * has set up 64-bit mode, roughly the x86 init sequence follows this path: >> > + * >> > + * Bare metal, KVM, Xen HVM Xen PV / dom0 >> > + * startup_64() startup_xen() >> > + * \ / >> > + * x86_64_start_kernel() xen_start_kernel() >> > + * \ / >> > + * x86_64_start_reservations() >> > + * | >> > + * start_kernel() >> > + * [ ...
] >> > + * [ setup_arch() ] >> > + * [ ... ] >> > + * init >> > + * >> >> >> I don't think this paragraph below is necessary. I also think it's >> very confusing. Keep in mind that the reader has no idea what a >> "level" is at this point and that the reader also doesn't need to >> think about terms like "paravirtualization yielding". >> >> > + * x86_64_start_kernel() and xen_start_kernel() are the respective first C code >> > + * entry starting points. The different entry points exist to enable Xen to >> > + * skip a lot of hardware setup already done and managed on behalf of the >> > + * hypervisor, we refer to this as "paravirtualization yielding". The different >> > + * levels of init calls on the x86 init sequence exist to account for these >> > + * slight differences and requirements. These different entry points also share >> > + * a common entry x86 specific path, x86_64_start_reservations(). > > OK sure, I'll nuke. > >> > + * >> >> And here, I don't even know what a "feature" is. > > Kasan is an example. Is kasan different from the cr4 shadow in this context? I don't really feel like calling the cr4 shadow a "feature" is meaningful. Am I misunderstanding something? > >> > + * A generic x86 feature can have different initialization calls, one on each >> > + * of the different main x86 init sequences, but must also address both entry >> > + * points in order to work properly across the board on all supported x86 >> > + * subarchitectures. Since x86 features can also have dependencies on other >> > + * setup code or features, x86 features can at times be subordinate to other >> > + * x86 features, or conditions. struct x86_init_fn enables feature developers >> > + * to annotate dependency relationships to ensure subsequent init calls only >> > + * run once a subordinate's dependencies have run. When needed custom >> > + * dependency requirements can also be spelled out through a custom dependency >> > + * checker. 
In order to account for the dual entry point nature of x86-64 Linux >> > + * for "paravirtualization yielding" and to make annotations for support for >> > + * these explicit each struct x86_init_fn must specify supported >> > + * subarchitectures. The earliest x86-64 code can read the subarchitecture >> > + * though is after load_idt(), as such the earliest we can currently rely on >> > + * subarchitecture for semantics and a common init sequence is on the shared >> > + * common x86_64_start_reservations(). Each struct x86_init_fn must also >> > + * declare a two-digit decimal number to impose an ordering relative to other >> > + * features when required. >> >> I'm totally lost in the paragraph above. > > As an example Kasan could be defined as a struct x86_init_fn with a respective > callback on x86_64_start_kernel() (kasan_early_init()) and later setup_arch() > (kasan_init()). Since we can't support struct x86_init_fn calls starting all > the way from x86_64_start_kernel() (given the issues I had noted before) > the earliest a struct x86_init_fn can be used for is a callback on > x86_64_start_reservations(). I'm still lost. Kasan is a pile of code that does something. Kasan has some initialization code. Do you mean that kasan's initialization code could be encapsulated in one or more x86_init_fn instances? > > It just means we can have different callbacks for different parts of the > x86 init process; it explains the limits to the callbacks we can have right now. Do you mean that there would be different linker tables containing x86_init_fn instances for each of these callback points? Or is "early_init" the only such callback point? BTW, why is early_init called early_init? Why not call it "callback"? Then you could reuse this for other things. >
For example: >> >> An x86_init_fn represents a function to be called at a certain point >> during initialization on a specific set of subarchitectures. > > OK thanks. > >> > + * >> > + * @order_level: must be set, linker order level, this corresponds to the table >> > + * section sub-table index, we record this only for semantic validation >> > + * purposes. >> >> I read this as "this is purely a debugging feature". Is it? I think it's not. > > Nope, the order_level really is used for linker table sorting, but since we can > modify the table for these structs with a run time sort later, we want to ensure > that the order level is still respected after we sort at run time. > >> > Order-level is always required however you typically would >> > + * only use X86_INIT_NORMAL*() and leave ordering to be done by placement >> > + * of code in a C file and the order of objects through a Makefile. Custom >> > + * order-levels can be used when order on C file and order of objects on >> > + * Makfiles does not suffice or much further refinements are needed. >> >> >> Assuming I understand this correctly, here's how I'd write it: >> >> @order_level: The temporal order of this x86_init_fn. Lower >> order_level numbers are called first. Ties are broken by order found >> in the Makefile and then by order in the C file. > > Thanks... yes. > >> >> Note, however, that my proposed explanation can't be right because it >> appears to conflict with "depend". Please adjust accordingly. >> >> > + * @supp_hardware_subarch: must be set, it represents the bitmask of supported >> > + * subarchitectures. We require each struct x86_init_fn to have this set >> > + * to require developer considerations for each supported x86 >> > + * subarchitecture and to build strong annotations of different possible >> > + * run time states particularly in consideration for the two main >> > + * different entry points for x86 Linux, to account for paravirtualization >> > + * yielding. 
>> >> Too much motivation, too little documentation. >> >> @supp_hardware_subarch: A bitmask of subarchitectures on which to call >> this init function. > > OK thanks. > > >> > + * >> > + * @detect: optional, if set returns true if the feature has been detected to >> > + * be required, it returns false if the feature has been detected to not >> > + * be required. >> >> I have absolutely no idea what this means. > > Well, so the order level stuff is used at link time, there are however some > run time environment conditions that might be important to consider as well, > the detect lets you write a C code routine which will check to see if the > feature should run, it returns false if the feature should not run or is > not required to run. So how is: static bool detect_foo(void) { return whatever; } static void init_foo(void) { do something; } .detect = detect_foo, .early_init = init_foo, different from: static void init_foo(void) { if (whatever) do something; } .early_init = init_foo, > > A cheesy way to implement a cheesy graph is to then use the same callback > as a pointer for dependency, so that if a feature depends on another, both > are on the same order level, and if they have a run time dependency > annotation, they can further annotate this relationship by having the > thing that depends, set its depends pointer to refer to the other > feature's detect callback. > > This is borrowed from the IOMMU init code in the kernel. Same logic. > Only thing is we also get link order support too. Link order semantics > and logic should suffice for most things, but not all as with the IOMMU > init stuff. But this doesn't work at all if .detect isn't set. >> And what happens if the depend-implied order is >> inconsistent with order_level? > > Ah -- that's what I was trying to explain above in the comment > about inconsistent state -- a run time check *after* we do run > time sorting goes through and checks for sanity.
> > All this is a light weight feature graph, if you will, other > folks are working on different solutions for this and while > I have considered sharing a solution for early init code this > seems a bit more ideal for now. Later in the future though we > should probably get together and talk about a proper feature > graph, with more consistent checks, etc. It's another reason > why I am also curious to see a SAT solver at run time in the > future, but this is super long term [0]... > > [0] http://kernelnewbies.org/KernelProjects/kconfig-sat > >> I would hope that order_level breaks ties in the topological sort and >> that there's a big fat warning if the order_level ordering is >> inconsistent with the topological ordering. Preferably the warning >> would be at build time, in which case it could be an error. > > Order level will always be respected at link time. The depend > and detect things are heuristics used at run time and since > it depends on run time state we can't do this at build time. > The big fat hairy nasty warning is there though. Curiously > enough I ended up finding out that the sanity check for > IOMMU was intended to only be run at debug time but the > debug thing was *always* set to true, so we've been running > the sanity checker in the kernel for ages. It's why I also > feel a bit more comfortable in keeping it live and matching > more closely to its implementation. I've gone to pains to > build off of the IOMMU init solution and simply extend > on the semantics there. Turns out there was much room for > enhancements. All the fine tuning is documented in the > userspace tree. So is this what you're saying: Callbacks are called in an order consistent with the dependencies specified by @depend. Ties are broken first by @order_level and then by linker order. It is an error for an x86_init_fn with lower @order_level to depend on an x86_init_fn with higher @order_level.
> >> > + * @early_init: required, routine which will run in x86_64_start_reservations() >> > + * after we ensure boot_params.hdr.hardware_subarch is accessible and >> > + * properly set. Memory is not yet available. This is the earliest we can >> > + * currently define a common shared callback since all callbacks need to >> > + * check for boot_params.hdr.hardware_subarch and this does not become accessible >> > + * on x86-64 until after load_idt(). >> >> What's this for? Under what conditions is it called? What order? >> Why is it part of x86_init_fn at all? > > It's one of the reasons we are adding this: callbacks for features. > It's called if its requirements are met (subarch, and depends). > I'll work on extending this to be clearer. I think this should just be ".callback". Let the context be implied by the linker table in which it appears. > >> > + */ >> > +struct x86_init_fn { >> > + __u32 order_level; >> > + __u32 supp_hardware_subarch; >> > + bool (*detect)(void); >> > + bool (*depend)(void); >> > + void (*early_init)(void); >> > + __u32 flags; >> > +}; >> > + >> > +/** >> > + * enum x86_init_fn_flags: flags for init sequences >> > + * >> > + * X86_INIT_FINISH_IF_DETECTED: tells the core that once this init sequence >> > + * has completed it can break out of the loop for init sequences on >> > + * its own level. >> >> What does this mean? > > It borrows the logic from the IOMMU stuff, it enables the core to detect > if *any* routine in the linker table has completed so we can then just > bail and stop going through the rest of the linker table. Can you give an example for which this would be used? --Andy
On Fri, Feb 19, 2016 at 11:30:43PM -0800, Andy Lutomirski wrote: > On Fri, Feb 19, 2016 at 4:42 PM, Luis R. Rodriguez <mcgrof@kernl.org> wrote: > > On Fri, Feb 19, 2016 at 08:40:49AM -0800, Andy Lutomirski wrote: > >> On Fri, Feb 19, 2016 at 6:15 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > >> And here, I don't even know what a "feature" is. > > > > Kasan is an example. > > Is kasan different from the cr4 shadow in this context? I don't > really feel like calling the cr4 shadow a "feature" is meaningful. Am > I misunderstanding something? It's an early boot call that we can share with the different boot entries, or that each boot entry must call. Kasan is different in that it has a series of requirements for each boot entry: it needs an early boot call, then one at setup_arch(). In the context of how I'm implementing callbacks just as an example, cr4 shadow would just have one callback, whereas kasan could have an early_init() callback and a setup_arch() callback. > >> > + * A generic x86 feature can have different initialization calls, one on each > >> > + * of the different main x86 init sequences, but must also address both entry > >> > + * points in order to work properly across the board on all supported x86 > >> > + * subarchitectures. Since x86 features can also have dependencies on other > >> > + * setup code or features, x86 features can at times be subordinate to other > >> > + * x86 features, or conditions. struct x86_init_fn enables feature developers > >> > + * to annotate dependency relationships to ensure subsequent init calls only > >> > + * run once a subordinate's dependencies have run. When needed custom > >> > + * dependency requirements can also be spelled out through a custom dependency > >> > + * checker. 
In order to account for the dual entry point nature of x86-64 Linux > >> > + * for "paravirtualization yielding" and to make annotations for support for > >> > + * these explicit each struct x86_init_fn must specify supported > >> > + * subarchitectures. The earliest x86-64 code can read the subarchitecture > >> > + * though is after load_idt(), as such the earliest we can currently rely on > >> > + * subarchitecture for semantics and a common init sequences is on the shared > >> > + * common x86_64_start_reservations(). Each struct x86_init_fn must also > >> > + * declare a two-digit decimal number to impose an ordering relative to other > >> > + * features when required. > >> > >> I'm totally lost in the paragraph above. > > > > As an example Kasan could be defined as a struct x86_init_fn with a respective > > callback on x86_64_start_kernel() (kasan_early_init()) and later setup_arch() > > (kasan_init()). Since we can't support for struct x86_init_fn calls to start all > > the way from x86_64_start_kernel() (given the issues I had noted before) > > the earliest a struct x86_init_fn can be used for is for a callback on > > x86_64_start_reservations(). > > I'm still lost. Kasan is a pile of code that does something. Kasan > has some initialization code. Do you mean that kasan's initialization > code could be encapsulated in one or more x86_init_fn instances? Yup. Kasan has for instance: arch/x86/kernel/head64.c: asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data) { ... kasan_early_init() ... } Then on arch/x86/kernel/setup.c: void __init setup_arch(char **cmdline_p) { ... kasan_init(); ... 
} With x86_init_fn we'd have something like within arch/x86/mm/kasan_init_64.c: static LINKTABLE_INIT_DATA(x86_init_fns, X86_INIT_ORDER_EARLY) __x86_init_fn_kasan_early_init = { .order_level = X86_INIT_ORDER_EARLY, .supp_hardware_subarch = BIT(X86_SUBARCH_PC), .early_init = kasan_early_init, .setup_arch = kasan_init, }; Only we'd wrap this in a macro wrapper to make it look something like: x86_init_early_pc_setup(kasan_early_init, kasan_init); With that in place we can remove the explicit kasan_init() call on setup.c and kasan_early_init() in head64.c as x86_64_start_kernel() would have a single x86_init_fn_early_init(); and setup_arch() would have something like x86_init_fn_setup_arch() -- each of those would iterate over the x86_init_fns linker table, and call their respective callback. Please note though that since subarch is currently not readable until after load_idt() it means that we can't fold kasan into this framework yet; it's why I placed x86_init_fn_early_init() in x86_64_start_reservations(). The proactive measure here would be that if a developer were to add a feature similar in nature to Kasan they'd have to think about what subarch they support and vet / ensure they work, and if they can't get it to work at least they'd need to design it in a way that enables it to be disabled at run time. Right now Kasan cannot be disabled at run time. > > It just means we can have different callbacks for different parts of the > > x86 init process, it explains the limits to the callbacks we can have right now. > > Do you mean that there would be different linker tables containing > x86_init_fn instances for each of these callback points? Or is > "early_init" the only such callback point? No, we'd have only one linker table: x86_init_fns, and in it would be structs of type struct x86_init_fn. 
We can iterate over the linker table anywhere, in our case since we're using it to help with the init process, and since x86 features can have calls in different parts of the x86 init process we can just iterate over the linker table during these different key places and have it call the needed callback for each place. So x86_init_fn_early_init() would call early_init(). It's the only callback we have right now. Later, we may add a setup_arch() callback and then place a x86_init_fn_setup_arch() on setup_arch() to remove clutter there. > BTW, why is early_init called early_init? Why not call it "callback"? > Then you could reuse this for other things. Because it's trying to make it clear where in the x86 init path it's going to be called, in this case today the earliest I can get subarch to be read is x86_64_start_reservations(). > >> > + * @detect: optional, if set returns true if the feature has been detected to > >> > + * be required, it returns false if the feature has been detected to not > >> > + * be required. > >> > >> I have absolutely no idea what this means. > > > > Well, so the order level stuff is used at link time, there however are some > > run time environment conditions that might be important to consider as well, > > the detect lets you write a C code routine which will check to see if the > > feature should run, it returns false if the feature should not run or is > > not required to run. > > So how is: > > static bool detect_foo(void) { return whatever; } > static void init_foo(void) { do something; } > > .detect = detect_foo, > .early_init = init_foo, > > different from: > > static void init_foo(void) > { > if (whatever) do something; > } > > .early_init = init_foo, It's not, but the key here is not @detect but rather the pair of both @detect and @depend, the @depend would refer to a @detect routine from another "feature". 
We then just run time sort things to kick off in the right order -- this would help for cases where the linker sort would not suffice (things which can only be determined at run time). The run time sort was added for IOMMU originally to handle a complex dependency chain which could only be determined at run time. Instead of sprinkling checks all over we have the dependencies clearly annotated with this simple structure and have a run time sort, sort the init sequences as they need to happen. It's important to not only ensure order but also to ensure things that *should not run* don't run. Refer to the implementation in arch/x86/kernel/sort-init.c which is heavily borrowed from the existing IOMMU run time sort we do to handle the run time dependencies there. The reason we can't share sort between x86 and IOMMU is that the structs vary slightly, so the structs are of different size and the semantics may end up being very different. Lastly, if there were a bug in the IOMMU sort / semantics we could end up regressing x86 init as well. We can't possibly assume we can share the same semantics across the board for all users of linker tables. In this case it's best to just keep these linker tables separate and enable their own semantics to be defined independently. > > A cheesy way to implement a cheesy graph is to then use the same callback > > as a pointer for dependency, so that if a feature depends on another, both > > are on the same order level, and if they have a run time dependency > > annotation, they can further annotate this relationship by having the > > thing that depends, set its depends pointer to refer to the other > > feature's detect callback. > > > > This is borrowed from the IOMMU init code in the kernel. Same logic. > > Only thing is we also get link order support too. Link order semantics > > and logic should suffice for most things, but not all as with the IOMMU > > init stuff. > > But this doesn't work at all if .detect isn't set. I've changed the semantics for x86. 
So for x86 this is optional, if detect is null we ignore it. Likewise if depend is null, we ignore it for sorting purposes. > >> And what happens if the depend-implied order is > >> inconsistent with order_level. > > > > Ah -- that's what I was trying to explain above in the comment > > about inconsistent state -- a run time check *after* we do run > > time sorting goes through and checks for sanity. > > > > All this is a light weight feature graph, if you will, other > > folks are working on different solutions for this and while > > I have considered sharing a solution for early init code this > > seems a bit more ideal for now. Later in the future though we > > should probably get together and talk about a proper feature > > graph, with more consistent checks, etc. Its another reason > > why I am also curious to see a SAT solver at run time in the > > future, but this is super long term [0]... > > > > [0] http://kernelnewbies.org/KernelProjects/kconfig-sat > > > >> I would hope that order_level breaks ties in the topological sort and > >> that there's a bit fat warning if the order_level ordering is > >> inconsistent with the topological ordering. Preferably the warning > >> would be at build time, in which case it could be an error. > > > > Order level will always be respected at link time. The depend > > and detect thing are heuristics used at run time and since > > it depends on run time state we can't do this at build time. > > The big fat hairy nasty warning is there though. Curious > > enough I ended up finding out that the sanity check for > > IOMMU was intended to only be run at debug time but the > > debug thing was *always* set to true, so we've been running > > the sanity checker in the kernel for ages. Its why I also > > feel a bit more comfortable in keeping it live and matching > > more close to its implementation. I've gone to pains to > > build off of the IOMMU init solution and simply extend > > on the semantics there. 
Turns out there was much room for > > enhancements. All the fine tuning is documented in the > > userspace tree. > > So is this what you're saying: > > Callbacks are called in an order consistent with the dependencies > specified by @depend. No, @depend just helps with run time sorting the linker table so things that need to go first which require run time semantics get tucked in first. This should help with cruft checks, waiting on dependencies, so it should speed up the process as well. For this x86 init stuff there will be one callback for each major x86 init path, we'll start off with x86_init_fn_early_init() which is the earliest we can get the subarch. > Ties are broken first by @order_level and then > by linker order. Right. > It is an error for an x86_init_fn with lower > @order_level to depend on an x86_init_fn with higher @order_level. Right. Another way to say this: If p @depends on q, q's order level must be <= p's, as q should run first. Note that technically if we didn't want this check we could remove it, I tested that it would force the run time sort to then override the order level, but that makes semantics sloppy, so I've pegged this as a requirement in definition and built in a checker for it. This is in x86_init_fn_check(). > >> > + * @early_init: required, routine which will run in x86_64_start_reservations() > >> > + * after we ensure boot_params.hdr.hardware_subarch is accessible and > >> > + * properly set. Memory is not yet available. This is the earliest we can > >> > + * currently define a common shared callback since all callbacks need to > >> > + * check for boot_params.hdr.hardware_subarch and this does not become accessible > >> > + * on x86-64 until after load_idt(). > >> > >> What's this for? Under what conditions is it called? What order? > >> Why is it part of x86_init_fn at all? > > > > It's one of the reasons we are adding this: callbacks for features. > > It's called if its requirements are met (subarch, and depends). 
> > I'll work on extending this to be clearer. > > I think this should just be ".callback". Let the context be implied > by the linker table in which it appears. The thing is we want to be specific and allow one struct per main x86 init path major call site; this lets you have one struct for Kasan, for instance, with the semantics well spelled out. > > > >> > + */ > >> > +struct x86_init_fn { > >> > + __u32 order_level; > >> > + __u32 supp_hardware_subarch; > >> > + bool (*detect)(void); > >> > + bool (*depend)(void); > >> > + void (*early_init)(void); > >> > + __u32 flags; > >> > +}; > >> > + > >> > +/** > >> > + * enum x86_init_fn_flags: flags for init sequences > >> > + * > >> > + * X86_INIT_FINISH_IF_DETECTED: tells the core that once this init sequence > >> > + * has completed it can break out of the loop for init sequences on > >> > + * its own level. > >> > >> What does this mean? > > > > It borrows the logic from the IOMMU stuff, it enables the core to detect > > if *any* routine in the linker table has completed so we can then just > > bail and stop going through the rest of the linker table. > > Can you give an example for which this would be used? I'm going to rip this out. I know this will be useful but since I can't see anything obvious right now that simple link order won't address on early boot I'm going to rip this out in the next series. If we find we need it later we can consider it then. Such a run time sort might prove more useful later in boot where asynchronous behaviour is more prevalent. Folks working in higher layers may find more use for it. Luis
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug index 9b18ed97a8a2..af5582d33dd8 100644 --- a/arch/x86/Kconfig.debug +++ b/arch/x86/Kconfig.debug @@ -383,4 +383,51 @@ config PUNIT_ATOM_DEBUG The current power state can be read from /sys/kernel/debug/punit_atom/dev_power_state +config X86_DEBUG_LINKER_TABLES + bool "x86 linker table debug" + depends on DEBUG_KERNEL + default n + ---help--- + Enabling this should enable debugging linker tables on x86 + with its own solution. To help debug it enables the "x86 beta" + and "x86 gamma" built-in features. Both "x86 alpha" and "x86 delta" + built-in features are left disabled. You should see all tables + compiled, except delta; only the beta and gamma features will be + linked in. The delta feature will only be compiled and linked in + if and only if it's enabled the old way. + + For more details on these linker tables refer to: + + include/linux/tables.h + + If unsure, say N. + +config X86_DEBUG_LINKER_TABLE_ALPHA + bool "x86 linker table alpha" + default n + depends on X86_DEBUG_LINKER_TABLES + ---help--- + Enabling this should enable the linker table "x86 alpha" feature. + +config X86_DEBUG_LINKER_TABLE_BETA + bool "x86 linker table beta" + default y + depends on X86_DEBUG_LINKER_TABLES + ---help--- + Enabling this should enable the linker table "x86 beta" feature. + +config X86_DEBUG_LINKER_TABLE_GAMMA + bool "x86 linker table gamma" + default y + depends on X86_DEBUG_LINKER_TABLES + ---help--- + Enabling this should enable the linker table "x86 gamma" feature. + +config X86_DEBUG_LINKER_TABLE_DELTA + bool "x86 linker table delta" + default n + depends on X86_DEBUG_LINKER_TABLES + ---help--- + Enabling this should enable the linker table "x86 delta" feature. 
+ endmenu diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h index 1ae89a2721d6..0df68c814147 100644 --- a/arch/x86/include/asm/x86_init.h +++ b/arch/x86/include/asm/x86_init.h @@ -2,6 +2,7 @@ #define _ASM_X86_PLATFORM_H #include <asm/bootparam.h> +#include <asm/x86_init_fn.h> struct mpc_bus; struct mpc_cpu; diff --git a/arch/x86/include/asm/x86_init_fn.h b/arch/x86/include/asm/x86_init_fn.h new file mode 100644 index 000000000000..1521a546c734 --- /dev/null +++ b/arch/x86/include/asm/x86_init_fn.h @@ -0,0 +1,263 @@ +#ifndef __X86_INIT_TABLES_H +#define __X86_INIT_TABLES_H + +#include <linux/types.h> +#include <linux/tables.h> +#include <linux/init.h> +#include <linux/bitops.h> + +/** + * struct x86_init_fn - x86 generic kernel init call + * + * Linux x86 features vary in complexity, features may require work done at + * different levels of the full x86 init sequence. Today there are also two + * different possible entry points for Linux on x86, one for bare metal, KVM + * and Xen HVM, and another for Xen PV guests / dom0. Assuming a bootloader + * has set up 64-bit mode, roughly the x86 init sequence follows this path: + * + * Bare metal, KVM, Xen HVM Xen PV / dom0 + * startup_64() startup_xen() + * \ / + * x86_64_start_kernel() xen_start_kernel() + * \ / + * x86_64_start_reservations() + * | + * start_kernel() + * [ ... ] + * [ setup_arch() ] + * [ ... ] + * init + * + * x86_64_start_kernel() and xen_start_kernel() are the respective first C code + * entry starting points. The different entry points exist to enable Xen to + * skip a lot of hardware setup already done and managed on behalf of the + * hypervisor, we refer to this as "paravirtualization yielding". The different + * levels of init calls on the x86 init sequence exist to account for these + * slight differences and requirements. These different entry points also share + * a common entry x86 specific path, x86_64_start_reservations(). 
+ * + * A generic x86 feature can have different initialization calls, one on each + * of the different main x86 init sequences, but must also address both entry + * points in order to work properly across the board on all supported x86 + * subarchitectures. Since x86 features can also have dependencies on other + * setup code or features, x86 features can at times be subordinate to other + * x86 features, or conditions. struct x86_init_fn enables feature developers + * to annotate dependency relationships to ensure subsequent init calls only + * run once a subordinate's dependencies have run. When needed, custom + * dependency requirements can also be spelled out through a custom dependency + * checker. In order to account for the dual entry point nature of x86-64 Linux + * for "paravirtualization yielding" and to make annotations for support for + * these explicit, each struct x86_init_fn must specify supported + * subarchitectures. The earliest x86-64 code can read the subarchitecture + * though is after load_idt(), as such the earliest we can currently rely on + * subarchitecture for semantics and a common init sequence is on the shared + * common x86_64_start_reservations(). Each struct x86_init_fn must also + * declare a two-digit decimal number to impose an ordering relative to other + * features when required. + * + * x86_init_fn enables strong semantics and dependencies to be defined and + * implemented on the full x86 initialization sequence. + * + * @order_level: must be set, linker order level, this corresponds to the table + * section sub-table index, we record this only for semantic validation + * purposes. Order-level is always required, however you typically would + * only use X86_INIT_NORMAL*() and leave ordering to be done by placement + * of code in a C file and the order of objects through a Makefile. 
Custom + * order-levels can be used when order in a C file and order of objects in + * Makefiles does not suffice or further refinement is needed. + * @supp_hardware_subarch: must be set, it represents the bitmask of supported + * subarchitectures. We require each struct x86_init_fn to have this set + * to require developer considerations for each supported x86 + * subarchitecture and to build strong annotations of different possible + * run time states particularly in consideration for the two main + * different entry points for x86 Linux, to account for paravirtualization + * yielding. + * + * The subarchitecture is read by the kernel at early boot from the + * struct boot_params hardware_subarch. Support for the subarchitecture + * exists as of x86 boot protocol 2.07. The bootloader would have set up + * the respective hardware_subarch on the boot sector as per + * Documentation/x86/boot.txt. + * + * What x86 entry point is used is determined at run time by the + * bootloader. Linux pv_ops was designed to help enable building one Linux + * binary to support bare metal and different hypervisors. pv_ops setup + * code however is limited in that all pv_ops setup code is run late in + * the x86 init sequence, during setup_arch(). In fact cpu_has_hypervisor + * only works after early_cpu_init() during setup_arch(). If an x86 + * feature requires an earlier determination of what hypervisor was used, + * or if it needs to annotate only support for certain hypervisors, the + * x86 hardware_subarch should be set by the bootloader and + * @supp_hardware_subarch set by the x86 feature. Using hardware_subarch + * enables x86 features to fill the semantic gap between the Linux x86 + * entry point used and what pv_ops has to offer through a hypervisor + * agnostic mechanism. + * + * Each supported subarchitecture is set using the respective + * X86_SUBARCH_* as a bit in the bitmask. 
For instance if a feature + * is supported on PC and Xen subarchitectures only you would set this + * bitmask to: + * + * BIT(X86_SUBARCH_PC) | + * BIT(X86_SUBARCH_XEN); + * + * @detect: optional, if set returns true if the feature has been detected to + * be required, it returns false if the feature has been detected to not + * be required. + * @depend: optional, if set this set of init routines must be called prior to + * the init routine whose respective detect routine we have set this + * depends callback to. This is only used for sorting purposes given + * all current init callbacks have a void return type. Sorting is + * implemented via x86_init_fn_sort(), it must be called only once, + * however you can delay sorting until you need it if you can ensure + * only @order_level and @supp_hardware_subarch can account for proper + * ordering and dependency requirements for all init sequences prior. + * If you do not have a depend callback set, it is assumed the order level + * (__x86_init_fn(level)) set by the init routine suffices to set the + * order for when the feature's respective callbacks are called with + * respect to other calls. Sorting of init calls with the same order level + * is determined by linker order, determined by order placement on C code + * and order listed on a Makefile. A routine that depends on another is + * known as being subordinate to the init routine it depends on. Routines + * that are subordinate must have an order-level of lower or + * equal priority to the order-level of the init sequence it depends on. + * @early_init: required, routine which will run in x86_64_start_reservations() + * after we ensure boot_params.hdr.hardware_subarch is accessible and + * properly set. Memory is not yet available. This is the earliest we can + * currently define a common shared callback since all callbacks need to + * check for boot_params.hdr.hardware_subarch and this does not become accessible + * on x86-64 until after load_idt(). 
+ * @flags: optional, bitmask of enum x86_init_fn_flags
+ */
+struct x86_init_fn {
+	__u32 order_level;
+	__u32 supp_hardware_subarch;
+	bool (*detect)(void);
+	bool (*depend)(void);
+	void (*early_init)(void);
+	__u32 flags;
+};
+
+/**
+ * enum x86_init_fn_flags - flags for init sequences
+ *
+ * X86_INIT_FINISH_IF_DETECTED: tells the core that once this init sequence
+ *	has completed it can break out of the loop for init sequences on
+ *	its own level.
+ * X86_INIT_DETECTED: private flag. Used by the x86 core to annotate that this
+ *	init sequence has been detected and that all of its callbacks
+ *	must be run during initialization.
+ */
+enum x86_init_fn_flags {
+	X86_INIT_FINISH_IF_DETECTED = BIT(0),
+	X86_INIT_DETECTED = BIT(1),
+};
+
+DECLARE_LINKTABLE_INIT(struct x86_init_fn, x86_init_fns);
+
+/* Init order levels, we can start at 0000 but reserve 0000-0999 for now */
+#define X86_INIT_ORDER_EARLY	1000
+#define X86_INIT_ORDER_NORMAL	3000
+#define X86_INIT_ORDER_LATE	5000
+
+/*
+ * Use LTO_REFERENCE_INITCALL just in case of issues with old versions of gcc.
+ * This might not be needed for linker tables due to how we compartmentalize
+ * sections and then order them at linker time, but just in case.
+ */
+
+#define x86_init(__level,					\
+		 __supp_hardware_subarch,			\
+		 __detect,					\
+		 __depend,					\
+		 __early_init)					\
+	static LINKTABLE_INIT(x86_init_fns, __level)		\
+		__x86_init_fn_##__early_init = {		\
+		.order_level = __level,				\
+		.supp_hardware_subarch = __supp_hardware_subarch,\
+		.detect = __detect,				\
+		.depend = __depend,				\
+		.early_init = __early_init,			\
+	};							\
+	LTO_REFERENCE_INITCALL(__x86_init_fn_##__early_init);
+
+#define x86_init_early(__supp_hardware_subarch,			\
+		       __detect,				\
+		       __depend,				\
+		       __early_init)				\
+	x86_init(X86_INIT_ORDER_EARLY, __supp_hardware_subarch,	\
+		 __detect, __depend,				\
+		 __early_init);
+
+#define x86_init_normal(__supp_hardware_subarch,		\
+			__detect,				\
+			__depend,				\
+			__early_init)				\
+	x86_init(X86_INIT_ORDER_NORMAL, __supp_hardware_subarch,\
+		 __detect, __depend,				\
+		 __early_init);
+
+#define x86_init_early_all(__detect,				\
+			   __depend,				\
+			   __early_init)			\
+	x86_init_early(X86_SUBARCH_ALL_SUBARCHS,		\
+		       __detect, __depend,			\
+		       __early_init);
+
+#define x86_init_early_pc(__detect,				\
+			  __depend,				\
+			  __early_init)				\
+	x86_init_early(BIT(X86_SUBARCH_PC),			\
+		       __detect, __depend,			\
+		       __early_init);
+
+#define x86_init_early_pc_simple(__early_init)			\
+	x86_init_early((BIT(X86_SUBARCH_PC)), NULL, NULL,	\
+		       __early_init);
+
+#define x86_init_normal_all(__detect,				\
+			    __depend,				\
+			    __early_init)			\
+	x86_init_normal(X86_SUBARCH_ALL_SUBARCHS,		\
+			__detect, __depend,			\
+			__early_init);
+
+#define x86_init_normal_pc(__detect,				\
+			   __depend,				\
+			   __early_init)			\
+	x86_init_normal((BIT(X86_SUBARCH_PC)),			\
+			__detect, __depend,			\
+			__early_init);
+
+#define x86_init_normal_xen(__detect,				\
+			    __depend,				\
+			    __early_init)			\
+	x86_init_normal((BIT(X86_SUBARCH_XEN)),			\
+			__detect, __depend,			\
+			__early_init);
+
+/**
+ * x86_init_fn_early_init - call all early_init() callbacks
+ *
+ * This calls all early_init() callbacks on the x86_init_fns linker table.
+ */
+void x86_init_fn_early_init(void);
+
+/**
+ * x86_init_fn_init_tables - sort and check the x86 linker table
+ *
+ * This sorts the struct x86_init_fn init sequences in the x86_init_fns
+ * linker table by ensuring that init sequences that depend on other init
+ * sequences are placed later in the linker table. Init sequences that do
+ * not have dependencies are left in place. Circular dependencies are not
+ * allowed. Subordinate init sequences, that is init sequences that depend
+ * on other init sequences, must have an order-level of lower or equal
+ * priority to the init sequences they depend on.
+ *
+ * This also validates the semantics of all struct x86_init_fn init
+ * sequences on the x86_init_fns linker table.
+ */
+void x86_init_fn_init_tables(void);
+
+#endif /* __X86_INIT_TABLES_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b1b78ffe01d0..be167a0a5e2c 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -23,6 +23,8 @@ KASAN_SANITIZE_dumpstack_$(BITS).o := n
 CFLAGS_irq.o := -I$(src)/../include/asm/trace
 obj-y			:= process_$(BITS).o signal.o
+obj-y			+= init.o sort-init.o
+obj-$(CONFIG_X86_DEBUG_LINKER_TABLES)	+= dbg-tables/
 obj-$(CONFIG_COMPAT)	+= signal_compat.o
 obj-y			+= traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
 obj-y			+= time.o ioport.o dumpstack.o nmi.o
diff --git a/arch/x86/kernel/dbg-tables/Makefile b/arch/x86/kernel/dbg-tables/Makefile
new file mode 100644
index 000000000000..02d12c502ad0
--- /dev/null
+++ b/arch/x86/kernel/dbg-tables/Makefile
@@ -0,0 +1,18 @@
+# You should see all of these compile, but at run time you'd
+# only see the ones that were linked in.
+
+# This ensures we always compile alpha but only link
+# it in if it was enabled. This is typically what
+# you are expected to use to avoid code bit-rot.
+table-$(CONFIG_X86_DEBUG_LINKER_TABLE_ALPHA) += alpha.o
+
+# This accomplishes the same, but requires 2 lines.
+extra-y += beta.o
+obj-$(CONFIG_X86_DEBUG_LINKER_TABLE_BETA) += beta.o
+
+table-$(CONFIG_X86_DEBUG_LINKER_TABLE_GAMMA) += gamma.o
+
+# If you *know* you only want to enable compilation of
+# a feature when it's selected, you can just use the good
+# ol' obj
+obj-$(CONFIG_X86_DEBUG_LINKER_TABLE_DELTA) += delta.o
diff --git a/arch/x86/kernel/dbg-tables/alpha.c b/arch/x86/kernel/dbg-tables/alpha.c
new file mode 100644
index 000000000000..54754f893a08
--- /dev/null
+++ b/arch/x86/kernel/dbg-tables/alpha.c
@@ -0,0 +1,10 @@
+#define pr_fmt(fmt) "debug-alpha: " fmt
+
+#include <linux/kernel.h>
+#include <asm/x86_init.h>
+
+static void early_init_dbg_alpha(void)
+{
+	pr_info("early_init triggered\n");
+}
+
+x86_init_early_pc_simple(early_init_dbg_alpha);
diff --git a/arch/x86/kernel/dbg-tables/beta.c b/arch/x86/kernel/dbg-tables/beta.c
new file mode 100644
index 000000000000..7384a57fc386
--- /dev/null
+++ b/arch/x86/kernel/dbg-tables/beta.c
@@ -0,0 +1,18 @@
+#define pr_fmt(fmt) "debug-beta: " fmt
+
+#include <linux/kernel.h>
+#include <asm/x86_init.h>
+
+#include "gamma.h"
+
+static bool x86_dbg_detect_beta(void)
+{
+	return true;
+}
+
+static void early_init_dbg_beta(void)
+{
+	pr_info("early_init triggered\n");
+}
+
+x86_init_early_pc(x86_dbg_detect_beta,
+		  x86_dbg_detect_gamma,
+		  early_init_dbg_beta);
diff --git a/arch/x86/kernel/dbg-tables/delta.c b/arch/x86/kernel/dbg-tables/delta.c
new file mode 100644
index 000000000000..9d38c68e602a
--- /dev/null
+++ b/arch/x86/kernel/dbg-tables/delta.c
@@ -0,0 +1,10 @@
+#define pr_fmt(fmt) "debug-delta: " fmt
+
+#include <linux/kernel.h>
+#include <asm/x86_init.h>
+
+static void early_init_dbg_delta(void)
+{
+	pr_info("early_init triggered\n");
+}
+
+x86_init_early_pc_simple(early_init_dbg_delta);
diff --git a/arch/x86/kernel/dbg-tables/gamma.c b/arch/x86/kernel/dbg-tables/gamma.c
new file mode 100644
index 000000000000..7b663c1f08f4
--- /dev/null
+++ b/arch/x86/kernel/dbg-tables/gamma.c
@@ -0,0 +1,18 @@
+#define pr_fmt(fmt)
"debug-gamma: " fmt + +#include <linux/kernel.h> +#include <asm/x86_init.h> + +bool x86_dbg_detect_gamma(void) +{ + return true; +} + +static void early_init_dbg_gamma(void) +{ + pr_info("early_init triggered\n"); +} + +x86_init_early_pc(x86_dbg_detect_gamma, + NULL, + early_init_dbg_gamma); diff --git a/arch/x86/kernel/dbg-tables/gamma.h b/arch/x86/kernel/dbg-tables/gamma.h new file mode 100644 index 000000000000..810d450ddd14 --- /dev/null +++ b/arch/x86/kernel/dbg-tables/gamma.h @@ -0,0 +1,3 @@ +#include <asm/x86_init.h> + +bool x86_dbg_detect_gamma(void); diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c index 2911ef3a9f1c..d93f3e42e61b 100644 --- a/arch/x86/kernel/head32.c +++ b/arch/x86/kernel/head32.c @@ -19,6 +19,7 @@ #include <asm/bios_ebda.h> #include <asm/tlbflush.h> #include <asm/bootparam_utils.h> +#include <asm/x86_init.h> static void __init i386_default_early_setup(void) { @@ -47,5 +48,8 @@ asmlinkage __visible void __init i386_start_kernel(void) break; } + x86_init_fn_init_tables(); + x86_init_fn_early_init(); + start_kernel(); } diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index 2c0f3407bd1f..a3b56aecaeda 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -28,6 +28,7 @@ #include <asm/bootparam_utils.h> #include <asm/microcode.h> #include <asm/kasan.h> +#include <asm/x86_init.h> /* * Manage page tables very early on. 
@@ -190,6 +191,9 @@ void __init x86_64_start_reservations(char *real_mode_data)
 	if (!boot_params.hdr.version)
 		copy_bootdata(__va(real_mode_data));
 
+	x86_init_fn_init_tables();
+	x86_init_fn_early_init();
+
 	reserve_ebda_region();
 
 	switch (boot_params.hdr.hardware_subarch) {
diff --git a/arch/x86/kernel/init.c b/arch/x86/kernel/init.c
new file mode 100644
index 000000000000..b012c59f11dc
--- /dev/null
+++ b/arch/x86/kernel/init.c
@@ -0,0 +1,55 @@
+#define pr_fmt(fmt) "x86-init: " fmt
+
+#include <linux/bug.h>
+#include <linux/kernel.h>
+
+#include <asm/x86_init_fn.h>
+#include <asm/bootparam.h>
+#include <asm/boot.h>
+#include <asm/setup.h>
+
+DEFINE_LINKTABLE_INIT(struct x86_init_fn, x86_init_fns);
+
+static bool x86_init_fn_supports_subarch(struct x86_init_fn *fn)
+{
+	if (!fn->supp_hardware_subarch) {
+		pr_err("Init sequence fails to declare any supported subarchs: %pF\n",
+		       fn->early_init);
+		WARN_ON(1);
+	}
+	if (BIT(boot_params.hdr.hardware_subarch) & fn->supp_hardware_subarch)
+		return true;
+	return false;
+}
+
+void __ref x86_init_fn_early_init(void)
+{
+	struct x86_init_fn *init_fn;
+	unsigned int num_inits = LINKTABLE_SIZE(x86_init_fns);
+
+	if (!num_inits)
+		return;
+
+	pr_debug("Number of init entries: %u\n", num_inits);
+
+	LINKTABLE_FOR_EACH(init_fn, x86_init_fns) {
+		if (!x86_init_fn_supports_subarch(init_fn))
+			continue;
+		/* Entries without a detect callback are always detected. */
+		if (!init_fn->detect || init_fn->detect())
+			init_fn->flags |= X86_INIT_DETECTED;
+
+		if (init_fn->flags & X86_INIT_DETECTED) {
+			pr_debug("Running early init %pF ...\n", init_fn->early_init);
+			init_fn->early_init();
+			pr_debug("Completed early init %pF\n", init_fn->early_init);
+			if (init_fn->flags & X86_INIT_FINISH_IF_DETECTED)
+				break;
+		}
+	}
+}
diff --git a/arch/x86/kernel/sort-init.c b/arch/x86/kernel/sort-init.c
new file mode 100644
index 000000000000..c03669f6f9d6
--- /dev/null
+++ 
b/arch/x86/kernel/sort-init.c
@@ -0,0 +1,114 @@
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+
+#include <asm/x86_init_fn.h>
+
+static struct x86_init_fn *x86_init_fn_find_dep(struct x86_init_fn *start,
+						struct x86_init_fn *finish,
+						struct x86_init_fn *q)
+{
+	struct x86_init_fn *p;
+
+	if (!q)
+		return NULL;
+
+	for (p = start; p < finish; p++)
+		if (p->detect == q->depend)
+			return p;
+
+	return NULL;
+}
+
+static void x86_init_fn_sort(struct x86_init_fn *start,
+			     struct x86_init_fn *finish)
+{
+	struct x86_init_fn *p, *q, tmp;
+
+	for (p = start; p < finish; p++) {
+again:
+		q = x86_init_fn_find_dep(start, finish, p);
+		/*
+		 * We are a bit sneaky here. We use the memory address to
+		 * figure out if the node we depend on is past our point;
+		 * if so, swap.
+		 */
+		if (q > p) {
+			tmp = *p;
+			memmove(p, q, sizeof(*p));
+			*q = tmp;
+			goto again;
+		}
+	}
+}
+
+static void x86_init_fn_check(struct x86_init_fn *start,
+			      struct x86_init_fn *finish)
+{
+	struct x86_init_fn *p, *q, *x;
+
+	/* Simple cyclic dependency checker. */
+	for (p = start; p < finish; p++) {
+		if (!p->depend)
+			continue;
+		q = x86_init_fn_find_dep(start, finish, p);
+		x = x86_init_fn_find_dep(start, finish, q);
+		if (p == x) {
+			pr_info("CYCLIC DEPENDENCY FOUND! %pF depends on %pF and vice-versa. BREAKING IT.\n",
+				p->early_init, q->early_init);
+			/* Heavy handed way.. */
+			x->depend = NULL;
+		}
+	}
+
+	/*
+	 * Validate sorting semantics.
+	 *
+	 * p depends on q so:
+	 * - q must run first, so q < p. If q > p that's an issue,
+	 *   as it says p must run prior to q. We already sorted
+	 *   this table; this is a problem.
+	 *
+	 * - q's order level must be <= p's, as it should run first.
+	 */
+	for (p = start; p < finish; p++) {
+		if (!p->depend)
+			continue;
+		/*
+		 * Be pedantic and do a full search on the entire table.
+		 * If we need further validation after this is called,
+		 * one could use an optimized version which just searches
+		 * on x86_init_fn_find_dep(p, finish, p), as we would have
+		 * a guarantee on proper ordering both at the dependency
+		 * level and by order level.
+		 */
+		q = x86_init_fn_find_dep(start, finish, p);
+		if (q && q > p) {
+			pr_info("EXECUTION ORDER INVALID! %pF should be called before %pF!\n",
+				p->early_init, q->early_init);
+		}
+
+		/*
+		 * Technically this would still work, as the memmove() would
+		 * have forced the dependency to run first; however we want
+		 * strong semantics, so let's avoid these.
+		 */
+		if (q && q->order_level > p->order_level) {
+			pr_info("INVALID ORDER LEVEL! %pF should have an order level <= that of %pF!\n",
+				p->early_init, q->early_init);
+		}
+	}
+}
+
+void __ref x86_init_fn_init_tables(void)
+{
+	unsigned int num_inits = LINKTABLE_SIZE(x86_init_fns);
+
+	if (!num_inits)
+		return;
+
+	x86_init_fn_sort(LINKTABLE_START(x86_init_fns),
+			 LINKTABLE_END(x86_init_fns));
+	x86_init_fn_check(LINKTABLE_START(x86_init_fns),
+			  LINKTABLE_END(x86_init_fns));
+}
Any failure on the x86 init path can be catastrophic. A simple shift of a
call from one place to another can easily break things. Likewise, adding a
new call to one path without considering all x86 requirements can make
certain x86 run time environments crash. We currently account for these
requirements through peer code review and run time testing. We could do
much better if we had a clean and simple way to annotate strong semantics
for run time requirements, init sequence dependencies, and detection
mechanisms for additions of new x86 init sequences.

This adds support for linker tables for x86 in order to help define strong
semantics for x86 init sequences and, as collateral, help simplify the x86
init sequence. The defined struct x86_init_fn is inspired by iPXE's sample
definitions, but is also heavily inspired by Linux's own IOMMU
initialization solution, which extends initialization semantics with
custom init routine detection routines and enables adding annotations for
init routine dependency mapping.

A newly featured solution in this design is to build on the hardware
subarchitecture added to the x86 boot protocol 2.07 in 2007 by Rusty, but
never really taken advantage of except by lguest. It is used here as a
stop-gap for new x86 features which have failed to take into consideration
the dual x86-64 entry points made possible by paravirtualization and the
requirements that yields. The hardware subarchitecture could also
potentially be used in the future to help unify Linux x86 entry points.
The current disjoint entry points for x86-64 can be summarized as follows.

    Bare metal, KVM, Xen HVM        Xen PV / dom0
         startup_64()                startup_xen()
                \                       /
        x86_64_start_kernel()   xen_start_kernel()
                        \          /
                 x86_64_start_reservations()
                            |
                      start_kernel()
                      [ ...
                      ]
                      [ setup_arch() ]
                      [ ... ]
                      init

By the nature of using linker tables we also inherit the ability to sort
init sequences through the linker, via order-levels and the linker SORT()
directive, and to only optionally enable explicit init sequence sorting
when init sequence dependency semantics are needed.

By using linker tables we can also avoid #ifdef'ery in select code when
and if desirable. This too is completely optional, but it is a feature we
inherit by using linker tables.

For *new x86 features* this enables strong semantics to be considered. For
*existing x86 feature code* this enables the opportunity to clarify
requirements and dependencies, which ultimately also provides a new way to
ensure that code that should not run never runs; that is, it provides one
mechanism to help prevent dead code. For a more elaborate description of
what dead code is exactly, and how it can relate to init sequences,
particularly for Xen due to its init sequence, refer to [0] and [1].

Four debug x86 features are included to help test x86 linker table
support. You'll want CONFIG_X86_DEBUG_LINKER_TABLES to enable those. The
features defined with table-$(CONFIG-x) will *always* be compiled but only
linked in when enabled. The features defined with the good ol' obj-y will
only be compiled as we're used to.

If CONFIG_X86_DEBUG_LINKER_TABLES is disabled nothing currently runs, as
there are no init sequences yet using any of this. The goal is to evolve
the semantics carefully as needed. We start with one basic callback, an
early_init() called at the beginning of x86_64_start_reservations().

v2:

- port to the new linker tables as discussed

- since the start and end of a table are now the empty string and the "~"
  string, we can use any digit range for our order levels. Because of this,
  bump the order level ranges for x86 from 01-99 to 0000-9999. We can
  arbitrarily change this later, but this also gives us a lot of leg room
  for easy adjustments later.
- since we are now using the basic .init section we have coverage from
  modpost to check section mismatch issues for us! This also means we had
  to peg __ref on a few of our callers which used .init section code.

- since we're using the standard .init section we can now drop our custom
  modifications of both arch/x86/tools/relocs.c and
  arch/x86/kernel/vmlinux.lds.S! To be clear, we don't need any further
  linker script hacks when making use of linker tables.

[0] http://www.do-not-panic.com/2015/12/avoiding-dead-code-pvops-not-silver-bullet.html
[1] http://www.do-not-panic.com/2015/12/xen-and-x86-linux-zero-page.html

Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 arch/x86/Kconfig.debug              |  47 +++++++
 arch/x86/include/asm/x86_init.h     |   1 +
 arch/x86/include/asm/x86_init_fn.h  | 263 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/Makefile            |   2 +
 arch/x86/kernel/dbg-tables/Makefile |  18 +++
 arch/x86/kernel/dbg-tables/alpha.c  |  10 ++
 arch/x86/kernel/dbg-tables/beta.c   |  18 +++
 arch/x86/kernel/dbg-tables/delta.c  |  10 ++
 arch/x86/kernel/dbg-tables/gamma.c  |  18 +++
 arch/x86/kernel/dbg-tables/gamma.h  |   3 +
 arch/x86/kernel/head32.c            |   4 +
 arch/x86/kernel/head64.c            |   4 +
 arch/x86/kernel/init.c              |  55 ++++++++
 arch/x86/kernel/sort-init.c         | 114 ++++++++++++++++
 14 files changed, 567 insertions(+)
 create mode 100644 arch/x86/include/asm/x86_init_fn.h
 create mode 100644 arch/x86/kernel/dbg-tables/Makefile
 create mode 100644 arch/x86/kernel/dbg-tables/alpha.c
 create mode 100644 arch/x86/kernel/dbg-tables/beta.c
 create mode 100644 arch/x86/kernel/dbg-tables/delta.c
 create mode 100644 arch/x86/kernel/dbg-tables/gamma.c
 create mode 100644 arch/x86/kernel/dbg-tables/gamma.h
 create mode 100644 arch/x86/kernel/init.c
 create mode 100644 arch/x86/kernel/sort-init.c