CPU Module Interface
====================
Introduction
------------
The CPU module interface allows the kernel and certain kernel modules
to perform various operations on the physical cpus of a system without
regard for the particular cpu vendor/family/model/stepping. The
present API members are concerned with machine check and fault
management, but there is no reason that the API cannot be expanded
for other functionality.
XXFM Needs expansion and clarification.
Currently "the cpus" is defined to be "those things for which
a cpu_t is allocated". This amounts to every core, or every
virtual cpu on a hyperthreaded system (which is unsatisfactory
since hardware threads share many core resources). This definition,
and the CPU module API, need to mature for the possible presence
of a hypervisor which may be presenting a set of virtual cpus to us
and in which a cpu_t does not correspond to any one fixed physical
execution resource.
In its current form the CPU module API often applies to the "current"
CPU, to which we are bound either by context (e.g., in an interrupt
handler when we call the function) or by API consumer requirement
(they must bind or otherwise prevent migration before calling).
The changes to allow for virtualisation will likely introduce
a "handle" argument that callers will pass and which will in some
way identify the real target CPU.
Most CPU module API members are implemented by calling into a
CPU module implementation that has been associated with the
target cpu (by vendor/family/model/stepping if available, or the generic
implementation otherwise). For example, cmi_mca_init simply calls
the corresponding MCA initialization function in whatever CPU module
has been loaded for the target CPU.
CPU module implementations may themselves use some of the cpu module
interface, with the obvious exception that they should not use an
API member that calls back into the implementation, since the
implementation would then call back into the API again, and so on
without end. The only members expected to be used in this way are
cmi_rdmsr and cmi_wrmsr, which should be used in the implementation
of cmi_mca_init, for example.
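The call-through arrangement described above can be pictured as a
per-cpu operations vector. What follows is a minimal userland sketch,
not the actual implementation: every name beginning with demo_ is
hypothetical, and where the real cmi_mca_init takes no argument and
acts on the current cpu, the sketch passes the cpu explicitly.

```c
#include <assert.h>
#include <stddef.h>

struct demo_cpu;

/* Hypothetical ops vector exported by a CPU module implementation. */
struct demo_cmi_ops {
	void (*mca_init)(struct demo_cpu *);
};

/* Hypothetical per-cpu record binding a cpu to its implementation. */
struct demo_cpu {
	const struct demo_cmi_ops *ops;	/* NULL if no module is attached */
	int mca_ready;			/* set by the implementation */
};

/* Stand-in for an implementation's MCA initialization routine. */
static void
demo_generic_mca_init(struct demo_cpu *cp)
{
	cp->mca_ready = 1;
}

const struct demo_cmi_ops demo_generic_ops = { demo_generic_mca_init };

/*
 * API-side wrapper: it has no return value, and it short-circuits
 * safely when no implementation is attached to the cpu.
 */
void
demo_cmi_mca_init(struct demo_cpu *cp)
{
	assert(cp != NULL);
	if (cp->ops == NULL || cp->ops->mca_init == NULL)
		return;
	cp->ops->mca_init(cp);
}
```

A cpu whose ops pointer was never set is simply left untouched, which
is the "fail without error indication" behaviour expected of the void
API members.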
API Success/Failure Semantics
-----------------------------
The common semantic is that an API function should either always
work, or may fail without error indication provided that failure cannot
cascade on to future misbehaviour (e.g., cmi_init failure should not
cause a panic if the consumer later calls cmi_mca_init). These API members
are of type void.
Where a success or failure indication is required we return a
cmi_errno_t; anything other than CMI_SUCCESS is a failure.
A specific failure reason should be indicated, or CMIERR_UNKNOWN
as a catch-all.
cmi_init
--------
extern void cmi_init(void);
Initialises the CPU module interface for the current cpu; the caller
is responsible for ensuring that the "current" cpu cannot change
for the duration of this call. This function should only be called
once for each cpu (for the Solaris image lifetime or, if DR applies,
for as long as this is a valid cpu in the system).
cmi_init should always succeed, since if no more-specific support is
found it will always fall back to generic x86 cpu support (cpu.generic).
If it does not succeed there is no error indication, but it is still
safe to proceed to call other API members (all of which should short-circuit
and fail safely).
cmi_init first attempts to load a model-specific cpu module by searching
in directory /platform/i86pc/kernel/cpu (or /platform/i86pc/kernel/cpu/amd64
if running 64-bit) for a module whose filename matches in the following
list, in order of preference:
cpu....
cpu...
cpu..
cpu.
If no matching module is found we fall back to cpu.generic in the
same directories.
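The most-specific-first search can be sketched as below. The list
entries above do not spell out their name components, so the dotted
vendor.family.model.stepping layout used here is purely an assumption
made for illustration, as are the decimal formatting, the function
name demo_candidates and the DEMO_ constants.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define	DEMO_NCAND	5	/* four specific names plus the generic one */
#define	DEMO_NAMELEN	64

/*
 * Fill names[] with candidate module filenames in order of preference,
 * ending with the generic fallback. The component layout is an
 * assumption of this sketch, not taken from the document.
 */
void
demo_candidates(const char *vendor, unsigned family, unsigned model,
    unsigned stepping, char names[DEMO_NCAND][DEMO_NAMELEN])
{
	assert(vendor != NULL);
	snprintf(names[0], DEMO_NAMELEN, "cpu.%s.%u.%u.%u",
	    vendor, family, model, stepping);
	snprintf(names[1], DEMO_NAMELEN, "cpu.%s.%u.%u",
	    vendor, family, model);
	snprintf(names[2], DEMO_NAMELEN, "cpu.%s.%u", vendor, family);
	snprintf(names[3], DEMO_NAMELEN, "cpu.%s", vendor);
	snprintf(names[4], DEMO_NAMELEN, "cpu.generic");
}
```

A loader would try each name in turn and stop at the first module
that exists, so that cpu.generic is reached only when nothing more
specific matches.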
cmi_post_startup
----------------
extern void cmi_post_startup(void);
This is called during post-startup processing for the boot processor
only, as part of post_startup. Multiprocessor startup has not commenced
at this point. This is an opportunity for the CPU module implementation
to perform operations that apply to the platform rather than individual
cpus, such as to request that the BIOS cease SMI polling of MCA state.
cmi_post_mpstartup
------------------
extern void cmi_post_mpstartup(void);
This is called exactly once - from start_other_cpus once all processors
have started. Which cpu we are running on is undefined. This serves
as a hook to load or initialize other functionality as soon as possible
after all cpus are initialized and cmi_init has been called on all
cpus. An example would be to forceload a memory-controller driver at
this point.
cmi_faulted_enter, cmi_faulted_exit
-----------------------------------
extern void cmi_faulted_enter(struct cpu *);
extern void cmi_faulted_exit(struct cpu *);
cmi_faulted_enter is called when the indicated cpu enters the
CPU_FAULTED state. Which cpu we are running on is undefined - it
certainly is not the newly-faulted cpu, which has been offlined.
CPU module implementations may utilize this hook to further isolate
the faulted cpu.
cmi_faulted_exit is called when the indicated cpu leaves the CPU_FAULTED
state.
cmi_wrmsr, cmi_rdmsr
--------------------
extern cmi_errno_t cmi_wrmsr(uint_t, uint64_t);
extern cmi_errno_t cmi_rdmsr(uint_t, uint64_t *);
Write or read the given MSR (first argument). The operation is
performed under on_trap for OT_DATA_ACCESS protection, so any GPF
or similar that could result will be forgiven.
If cmi_mca_msrinterpose has been used to interpose an MSR value
then cmi_rdmsr will return any interposed value for the requested
MSR. cmi_wrmsr always performs a WRMSR attempt without interposition.
These return CMI_SUCCESS for success, or CMIERR_UNKNOWN on failure.
cmi_mca_init
------------
extern void cmi_mca_init(void);
This should be called on every cpu after cmi_init is called. It calls
through to the CPU module implementation to perform MCA initialization
of that cpu.
cmi_mca_msrinject
-----------------
extern cmi_errno_t cmi_mca_msrinject(cmi_mca_regs_t *, uint_t, int);
This interface supports the error injector driver 'memtest', enabling
writing to specified MSRs via the CPU module instead of with a raw
WRMSR (or even cmi_wrmsr) since the implementation may know of some
suitable enabler that makes certain MCA MSRs writeable.
If the CPU module implementation for the current cpu does not
support writing to MCA MSRs this call fails with CMIERR_NOTSUP;
it never attempts a WRMSR itself.
The first argument points to an array of MSR register offsets and
desired new values, sized per the second argument. The last
argument is a 'force' flag which is passed along to the CPU module
implementation as an indication that if it does not know a safe way
of writing these MSRs (such as clearing a lock bit first) then it
should simply attempt a WRMSR via cmi_wrmsr.
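The division of labour between the API and the implementation might
look like the following userland sketch. It rests on several
assumptions: the enum values, the layout of demo_mca_regs_t, the
demo_ function names and the fake MSR write are all stand-ins, and the
choice to return CMIERR_UNKNOWN when 'force' is clear and no safe
method is known is a guess, since that case is not specified above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the real definitions live in kernel headers. */
typedef enum { CMI_SUCCESS = 0, CMIERR_UNKNOWN, CMIERR_NOTSUP } cmi_errno_t;

/* Assumed layout: one MSR number and the value to write to it. */
typedef struct {
	unsigned	msrnum;
	uint64_t	msrval;
} demo_mca_regs_t;

/* Hypothetical implementation hook; NULL means "cannot write MCA MSRs". */
typedef cmi_errno_t (*demo_inject_fn)(demo_mca_regs_t *, unsigned, int);

static uint64_t demo_last_written;	/* stand-in for hardware state */
static int demo_writes;

/* Stand-in for cmi_wrmsr; it only records the write. */
static cmi_errno_t
demo_wrmsr(unsigned msr, uint64_t val)
{
	(void) msr;
	demo_last_written = val;
	demo_writes++;
	return (CMI_SUCCESS);
}

/* API side: never writes an MSR itself, only forwards the request. */
cmi_errno_t
demo_mca_msrinject(demo_inject_fn impl, demo_mca_regs_t *regs,
    unsigned nregs, int force)
{
	assert(regs != NULL || nregs == 0);
	if (impl == NULL)
		return (CMIERR_NOTSUP);
	return (impl(regs, nregs, force));
}

/* An implementation that knows no safe enabler for these MSRs. */
cmi_errno_t
demo_impl_inject(demo_mca_regs_t *regs, unsigned nregs, int force)
{
	unsigned i;

	if (!force)
		return (CMIERR_UNKNOWN);	/* assumption: refuse */
	for (i = 0; i < nregs; i++) {
		if (demo_wrmsr(regs[i].msrnum, regs[i].msrval) != CMI_SUCCESS)
			return (CMIERR_UNKNOWN);
	}
	return (CMI_SUCCESS);
}
```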
cmi_mca_msrinterpose
--------------------
extern cmi_errno_t cmi_mca_msrinterpose(cmi_mca_regs_t *, uint_t);
If a platform does not permit writing to MCA MSRs (e.g., it produces
a GPF if a nonzero value is written to MCi_STATUS for any bank)
then the injector can choose to fall back to interposing via
cmi_mca_msrinterpose. This interface does not perform any WRMSR
operations, but simply copies the desired MSR values for later
return when cmi_rdmsr is used (raw RDMSR will not see interposed
values).
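The relationship between interposition and the read path can be
sketched in userland as follows. This is illustrative only: the table,
its fixed size, the demo_ names and the enum values are assumptions,
the sketch interposes one register per call where the real interface
takes an array, and it keeps an interposed value until it is
overwritten, which the text above does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the real definitions live in kernel headers. */
typedef enum { CMI_SUCCESS = 0, CMIERR_UNKNOWN } cmi_errno_t;

#define	DEMO_MAX_INTERPOSE	8

struct demo_interpose {
	unsigned	msr;
	uint64_t	val;
};

static struct demo_interpose demo_tab[DEMO_MAX_INTERPOSE];
static unsigned demo_ntab;

/* Stand-in for a raw RDMSR; every register reads as zero here. */
static uint64_t
demo_raw_rdmsr(unsigned msr)
{
	(void) msr;
	return (0);
}

/* Record a value for later return; no hardware write is attempted. */
cmi_errno_t
demo_msrinterpose(unsigned msr, uint64_t val)
{
	unsigned i;

	for (i = 0; i < demo_ntab; i++) {
		if (demo_tab[i].msr == msr) {
			demo_tab[i].val = val;
			return (CMI_SUCCESS);
		}
	}
	if (demo_ntab == DEMO_MAX_INTERPOSE)
		return (CMIERR_UNKNOWN);
	demo_tab[demo_ntab].msr = msr;
	demo_tab[demo_ntab].val = val;
	demo_ntab++;
	return (CMI_SUCCESS);
}

/* Read path: prefer an interposed value, otherwise read the hardware. */
cmi_errno_t
demo_rdmsr(unsigned msr, uint64_t *valp)
{
	unsigned i;

	assert(valp != NULL);
	for (i = 0; i < demo_ntab; i++) {
		if (demo_tab[i].msr == msr) {
			*valp = demo_tab[i].val;
			return (CMI_SUCCESS);
		}
	}
	*valp = demo_raw_rdmsr(msr);
	return (CMI_SUCCESS);
}
```

After demo_msrinterpose has been called for a register, demo_rdmsr
returns the stored value while demo_raw_rdmsr still returns the
untouched hardware value, mirroring the distinction drawn above.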
cmi_mca_poke
------------
extern void cmi_mca_poke(void);
This interface serves the error injector driver 'memtest'.
This "pokes" any poller implemented in the CPU module to wake up
now rather than at the next scheduled wakeup.
cmi_mc_register
---------------
extern void cmi_mc_register(struct cpu *, const struct cmi_mc_ops *, void *);
The memory-controller driver typically attaches to a NorthBridge
PCI device and function; such attaches occur quite late in startup,
certainly well after cpu startup and associated cmi_init. Once
the memory-controller driver is ready for service it should register
itself with the CPU module instance for each CPU that is associated
with the memory controller instance using this interface. The last
argument is private to the memory-controller driver and will be
quoted back to it during other MC operations.
cmi_mc_scrubber_enable
----------------------
extern cmi_errno_t cmi_mc_scrubber_enable(struct cpu *);
If a memory-controller driver has registered for the given cpu then
call into it to request initialisation of any hardware memory
scrubber. Returns CMIERR_MC_ABSENT if no memory controller has registered,
CMIERR_MC_NOMEMSCRUB if the registered driver offers no entry point for
enabling scrubbing, some other CMIERR_MC* error on failure, or CMI_SUCCESS.
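Registration and the scrubber error returns fit together roughly as in
the userland sketch below. It is a sketch under stated assumptions:
struct demo_mc_ops and its single member, struct demo_cpu, the demo_
functions and the enum values are all hypothetical stand-ins for the
real kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins; the real definitions live in kernel headers. */
typedef enum {
	CMI_SUCCESS = 0,
	CMIERR_UNKNOWN,
	CMIERR_MC_ABSENT,
	CMIERR_MC_NOMEMSCRUB
} cmi_errno_t;

/* Hypothetical ops vector supplied by a memory-controller driver. */
struct demo_mc_ops {
	cmi_errno_t (*scrubber_enable)(void *mcdata);
};

/* Hypothetical per-cpu record holding the registration. */
struct demo_cpu {
	const struct demo_mc_ops *mc_ops;	/* NULL until registered */
	void *mc_data;				/* driver-private cookie */
};

/* Remember the driver's ops and private data for this cpu. */
void
demo_mc_register(struct demo_cpu *cp, const struct demo_mc_ops *ops,
    void *mcdata)
{
	assert(cp != NULL);
	cp->mc_ops = ops;
	cp->mc_data = mcdata;
}

/* Call into the registered driver, quoting its private data back. */
cmi_errno_t
demo_mc_scrubber_enable(struct demo_cpu *cp)
{
	assert(cp != NULL);
	if (cp->mc_ops == NULL)
		return (CMIERR_MC_ABSENT);
	if (cp->mc_ops->scrubber_enable == NULL)
		return (CMIERR_MC_NOMEMSCRUB);
	return (cp->mc_ops->scrubber_enable(cp->mc_data));
}

/* Example driver entry point: counts calls via its private data. */
cmi_errno_t
demo_drv_scrub(void *mcdata)
{
	int *calls = mcdata;

	(*calls)++;
	return (CMI_SUCCESS);
}
```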
cmi_mc_patounum
---------------
extern cmi_errno_t cmi_mc_patounum(uint64_t, uint8_t, uint8_t, uint32_t, int,
mc_unum_t *);
If a memory-controller driver has registered, call into it to
translate the given physical address (with highest and lowest valid
bits indicated by 2nd and 3rd arguments) and syndrome/syndrome-type
into a completed mc_unum_t structure which identifies the location of
the memory address in terms of node/chip-select/rank/channel/branch etc.
XXFM Need to specify mc_unum_t
cmi_mc_unumtopa
---------------
extern cmi_errno_t cmi_mc_unumtopa(mc_unum_t *, nvlist_t *, uint64_t *);
Given either a valid mc_unum_t structure or an FMRI, reconstitute a
physical address from the information therein. This is used in fault
cache replay to identify the addresses affected by a known bad resource
(identified by parameters that do not change when something like an
interleave factor is changed in the system configuration).