Module sunrise_kernel::cpu_locals
CPU local storage
We want some statics to be cpu-local (e.g. CURRENT_THREAD). We could implement this fully
in software, by having an area of memory that is replicated for every cpu core, where
statics are indexes into this memory area, and by providing getters and setters to access and modify
the cpu-local statics.
However, this is not ideal: it is not really optimized, and it is pretty tedious.
Instead we use the very common concept of Thread Local Storage (TLS), apply it to cpu cores instead of threads, and let the compiler do all the hard work for us.
Usage
In the kernel you declare a cpu-local using the #[thread_local] attribute:
#[thread_local] static MY_CPU_LOCAL: core::cell::Cell<u8> = core::cell::Cell::new(42);
and access it as if it were a regular static, except that each cpu core has its own view of the static.
The compiler is responsible for generating code that will access the right address, provided we configured TLS correctly.
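The "own view" behaviour can be tried outside the kernel. The sketch below is an analogy, not kernel code: it assumes a hosted target with `std`, uses the stable `thread_local!` macro in place of the `#[thread_local]` attribute, and uses threads in place of cpu cores. The names `MY_LOCAL`, `write_on_other_thread` and `read_on_this_thread` are made up for the illustration.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Every thread gets its own copy of this static, initialised to 42.
    static MY_LOCAL: Cell<u8> = Cell::new(42);
}

/// Spawns a thread, overwrites that thread's copy with `value`,
/// and returns what the spawned thread reads back.
fn write_on_other_thread(value: u8) -> u8 {
    thread::spawn(move || {
        MY_LOCAL.with(|local| {
            local.set(value);
            local.get()
        })
    })
    .join()
    .expect("worker thread panicked")
}

/// Reads the calling thread's copy, which other threads cannot touch.
fn read_on_this_thread() -> u8 {
    MY_LOCAL.with(|local| local.get())
}
```

A write made on the spawned thread is visible there only; the calling thread still reads the initial value, just as one core's write to a cpu-local is not seen by the other cores.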
Early boot
Note that you can't access a cpu-local static before init_cpu_locals is called, because
the cpu-local areas aren't initialized yet; doing so will likely result in a cpu exception
being raised, or in UB.
This means you can't ever access cpu-locals in early boot. If your code might be called during
early boot, we advise you to check ARE_CPU_LOCALS_INITIALIZED_YET to know whether you're allowed
to access your cpu-local static, and if not, return an error of some kind.
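A minimal sketch of such a guard, under stated assumptions: `CPU_LOCALS_READY` is a stand-in for ARE_CPU_LOCALS_INITIALIZED_YET (its being an `AtomicBool` is an assumption of this sketch), and `CpuLocalsNotReady`, `with_cpu_local` and `mark_cpu_locals_ready` are hypothetical names that do not exist in the kernel.

```rust
use core::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for the kernel's flag; the real static's type is assumed here.
static CPU_LOCALS_READY: AtomicBool = AtomicBool::new(false);

/// Hypothetical error returned when called too early in boot.
#[derive(Debug, PartialEq)]
struct CpuLocalsNotReady;

/// Runs `access` only if cpu-locals have been set up, otherwise returns an error
/// instead of touching memory that is not initialized yet.
fn with_cpu_local<T>(access: impl FnOnce() -> T) -> Result<T, CpuLocalsNotReady> {
    if CPU_LOCALS_READY.load(Ordering::Acquire) {
        Ok(access())
    } else {
        Err(CpuLocalsNotReady)
    }
}

/// Stand-in for what would happen at the end of cpu-local initialization.
fn mark_cpu_locals_ready() {
    CPU_LOCALS_READY.store(true, Ordering::Release);
}
```

The closure keeps the actual cpu-local access out of the early-boot path entirely: if the flag is false, the closure is never run.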
Inner workings
We implement TLS according to the conventions laid out by Ulrich Drepper's paper on TLS, which are followed by LLVM and most compilers.
Since we're running on i386, we're following variant II.
Each cpu core's gs segment points to a thread local memory area where cpu-local statics live.
Cpu-local statics are simply accessed through an offset from gs.
Those regions can be found in CPU_LOCAL_REGIONS.
The linker is in charge of creating an ELF segment of type PT_TLS where an initialization image
for cpu-local regions can be found, and which is meant to be copied for every cpu core we have.
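To make the variant II layout concrete, here is a rough sketch of building one region from an initialization image. It is not the kernel's implementation: `TlsImage`, `Region`, `build_region`, `align_up` and `offset_from_tp` are hypothetical names, the region lives in a `Vec<u8>` rather than in kernel memory, and the TCB is reduced to one pointer-sized slot. It assumes the usual variant II rule that the TLS block ends where the thread pointer (here, the base of gs) points, with the TCB starting at that address.

```rust
use core::mem::size_of;

/// Hypothetical view of a PT_TLS segment.
struct TlsImage<'a> {
    init_data: &'a [u8], // initialized bytes (p_filesz of them)
    mem_size: usize,     // total size including zero-filled part (p_memsz)
    align: usize,        // required alignment (p_align), a power of two
}

/// Rounds `value` up to the next multiple of `align` (a power of two).
fn align_up(value: usize, align: usize) -> usize {
    (value + align - 1) & !(align - 1)
}

/// One region: the TLS block followed by a pointer-sized TCB slot.
struct Region {
    bytes: Vec<u8>,
    /// Offset inside `bytes` that the thread pointer would designate.
    tp_offset: usize,
}

/// Copies the initialization image into a fresh, zeroed region.
fn build_region(image: &TlsImage<'_>) -> Region {
    let block_size = align_up(image.mem_size, image.align);
    // Zero-filled, so the part past `init_data` is already cleared.
    let mut bytes = vec![0u8; block_size + size_of::<usize>()];
    bytes[..image.init_data.len()].copy_from_slice(image.init_data);
    Region { bytes, tp_offset: block_size }
}

/// Signed offset from the thread pointer to a variable located
/// `var_offset` bytes into the image. Always negative in variant II.
fn offset_from_tp(image: &TlsImage<'_>, var_offset: usize) -> isize {
    var_offset as isize - align_up(image.mem_size, image.align) as isize
}
```

With a 10-byte image aligned to 4, the block is padded to 12 bytes, so a variable at image offset 8 sits at thread pointer minus 4.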
Segmentation
Each core gets its own GDT. In each of these there is a KTls segment which points to this
core's cpu-local area, and which is meant to be loaded into gs.
Because userspace might want to use Thread Local Storage too, and also needs gs to point to its
thread local area (see set_thread_area), we swap the segment gs points to every time
we enter and leave the kernel in trap_gate_asm, from UTls_Elf to KTls and back.
TLS on x86 is really weird. It uses variant II, where offsets must be subtracted from gs,
even though segmentation only supports adding offsets. The only way to make this work is to have
the gs segment's limit be 0xffffffff, effectively spanning the whole address space. When
the cpu adds a "negative" offset (e.g. 0xfffffffc for -4), it treats it as a huge unsigned
positive offset, which when added to gs's base "wraps around" the address space
and effectively ends up 4 bytes behind gs's base.
Illustration:
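The wrap-around can also be checked with plain 32-bit arithmetic. This is only a model of the address computation described above; `effective_address` and `encode_negative_offset` are illustrative names, and the base address used in the note below is an arbitrary example value.

```rust
/// Models the segmented address computation: base + offset, modulo 2^32.
fn effective_address(gs_base: u32, offset: u32) -> u32 {
    gs_base.wrapping_add(offset)
}

/// Encodes "n bytes below the base" as the unsigned offset the cpu will add.
fn encode_negative_offset(bytes_below_base: u32) -> u32 {
    0u32.wrapping_sub(bytes_below_base)
}
```

For example, `encode_negative_offset(4)` is `0xfffffffc`, and adding it to a base of `0x10000000` yields `0x0ffffffc`, which is 4 bytes below the base.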
dtv and __tls_get_addr
We're the kernel, and we don't do dynamic loading (no loadable kernel modules).
Because of this, we know our TLS model will be static (either Initial Exec or Local Exec).
Those models always access thread-locals directly via gs, and always short-circuit the dtv.
So we don't even bother allocating a dtv array at all. Neither do we define a __tls_get_addr function.
Structs
CpuLocalRegion | Represents an allocated cpu local region. |
ThreadControlBlock | Elf TLS TCB |
Statics
ARE_CPU_LOCALS_INITIALIZED_YET | Use this if your code might run in an early boot stage to know if you're allowed to access a cpu-local variable. Accessing one when this is false is UB. |
CPU_LOCAL_REGIONS | Array of cpu local regions, copied from the initialization image in kernel's ELF. |
Functions
get_cpu_locals_ptr_for_core | Address that should be put in |
init_cpu_locals | Initializes cpu locals during early boot stage. |
tls_align_up | The |