Running code before main in Rust
Lynx March 21, 2025 #MalDev #Rust
Today, I will explore the potential of executing user-defined code before the main function in Rust. This can be
accomplished using Thread-Local Storage (TLS) callbacks or by leveraging C Runtime (CRT) behavior. I will walk through
the implementation of TLS callbacks in Rust and delve into the details of the CRT, demonstrating how to use it to run
custom code.
Thread Local Storage Callbacks
According to the official Microsoft documentation, TLS callbacks are functions that support the construction
and destruction of data objects for Thread-Local Storage. This might sound cryptic and doesn't explain much at first,
but callbacks mainly allow you to manage the lifetime of objects that will be accessed at the thread level. You can think
of them as a kind of constructor/destructor for such objects, called whenever a thread is spawned or terminated. This means that callbacks can execute code before execution reaches the main function. Malware developers often exploit this fact to implement various techniques, such as anti-debugging or even actual malicious code execution.
The responsibility for calling these functions lies with the Windows Loader. Callback functions are stored in an array,
which can be accessed through a pointer stored in the so-called TLS Directory, which is defined in the PE
Optional Header Data Directories.
The TLS Directory is a simple struct that contains a field named AddressOfCallbacks. This field is a pointer to an
array of defined TLS callbacks. Each element of this array is the address of one callback function. The Windows loader
traverses this array and invokes these functions in a defined order.
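To make the layout concrete, here is a minimal sketch that decodes the 64-bit form of the TLS Directory from raw bytes and walks a null-terminated callback array. The names TlsDirectory64, parse_tls_directory, and callbacks_until_null are hypothetical helpers introduced for illustration, and the addresses in main are made-up sample values, not taken from a real binary.

```rust
/// 64-bit TLS Directory layout (40 bytes), with field names in snake_case.
#[derive(Debug)]
struct TlsDirectory64 {
    start_address_of_raw_data: u64,
    end_address_of_raw_data: u64,
    address_of_index: u64,
    /// Virtual address of a null-terminated array of callback pointers.
    address_of_callbacks: u64,
    size_of_zero_fill: u32,
    characteristics: u32,
}

/// Reads a little-endian u64 at `offset`, or None if the buffer is too short.
fn read_u64(bytes: &[u8], offset: usize) -> Option<u64> {
    let mut chunk = [0u8; 8];
    chunk.copy_from_slice(bytes.get(offset..offset + 8)?);
    Some(u64::from_le_bytes(chunk))
}

/// Reads a little-endian u32 at `offset`, or None if the buffer is too short.
fn read_u32(bytes: &[u8], offset: usize) -> Option<u32> {
    let mut chunk = [0u8; 4];
    chunk.copy_from_slice(bytes.get(offset..offset + 4)?);
    Some(u32::from_le_bytes(chunk))
}

/// Hypothetical helper: decodes the directory from its raw bytes.
fn parse_tls_directory(bytes: &[u8]) -> Option<TlsDirectory64> {
    Some(TlsDirectory64 {
        start_address_of_raw_data: read_u64(bytes, 0)?,
        end_address_of_raw_data: read_u64(bytes, 8)?,
        address_of_index: read_u64(bytes, 16)?,
        address_of_callbacks: read_u64(bytes, 24)?,
        size_of_zero_fill: read_u32(bytes, 32)?,
        characteristics: read_u32(bytes, 36)?,
    })
}

/// The callback array ends at the first null entry.
fn callbacks_until_null(array: &[u64]) -> Vec<u64> {
    array.iter().copied().take_while(|&va| va != 0).collect()
}

fn main() {
    // Synthetic directory: only AddressOfCallbacks is filled in (sample value).
    let mut raw = [0u8; 40];
    raw[24..32].copy_from_slice(&0x1_4001_0000u64.to_le_bytes());

    let directory = parse_tls_directory(&raw).expect("buffer holds 40 bytes");
    println!("{:#x?}", directory);

    // Synthetic callback array: one entry followed by the terminator.
    let callbacks = callbacks_until_null(&[0x1_4000_9AE0, 0]);
    println!("callbacks: {:x?}", callbacks);
}
```

Against a real file you would first resolve the TLS data directory entry to a file offset; that step is left out here to keep the sketch short.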
As Microsoft documentation states, most typical programs will have only one callback function, if any. The screenshot
below shows a default Rust Hello World program compiled in release mode. As you can see, there is one callback
function defined at address 0x140009AE0, which was also shown in Ghidra. This function is the "default" callback
function generated by the Rust compiler. To the best of my knowledge, it is always present in Rust binaries, at least
when default settings are applied.
After checking references to that callback function, I was led to a subsection named .CRT$XLB.
From there, this symbol was referenced by three functions:
std::rt::lang_start_internal, std::sys::thread_local::destructors::list::register, and std::thread::current::init_current.
I believe the names of these functions are self-explanatory, and they clearly show their connection with threads.
The lang_start_internal is a function that runs the actual main function (the function that will print "Hello world"),
and it is launched after the CRT has done its job.
So, now that we know what callbacks are and where they are placed (in the TLS Directory and the .CRT$XL subsection),
let's move on to the implementation details.
Implementing TLS callbacks
I'll show you two approaches for creating TLS callbacks in Rust. The first is inspired by methods commonly seen in C language implementations, and the second is based on solutions found in the rust-ctor crate. I'll start with the C-like implementation and explain details about the CRT subsections and their naming conventions.
C-like way
For the C-like approach, the callback function should have the following prototype, which is enforced by the
PIMAGE_TLS_CALLBACK type:
typedef VOID (NTAPI *PIMAGE_TLS_CALLBACK)(PVOID DllHandle, DWORD Reason, PVOID Reserved);
In Rust, the callback function will look like this:
extern "system" fn tls_callback_1(dll_handle: *mut c_void, reason: u32, reserved: *mut c_void) { /* ... */ }
The important thing is to mark the callback function as extern "system". By doing so, the compiler will generate this
function with the calling convention that the Windows ABI expects.
To register this function as a TLS callback, define the following static variable:
#[used]
#[link_section = ".CRT$XLB"]
static TLS_CALLBACK_1: PIMAGE_TLS_CALLBACK = Some(tls_callback_1);
Let's begin with its type. PIMAGE_TLS_CALLBACK is encapsulated in an Option in the windows crate, which is why
I've used Some to assign it to the TLS_CALLBACK_1 variable. Now, let's talk about the two attributes.
#[used] ensures that the variable won't be optimized out, especially in release mode. #[link_section = ".CRT$XLB"]
guarantees that the variable will be placed in the designated section of the PE file. In this case, I'm using the
special CRT subsection, which is associated with the C Runtime, as the name suggests. Next, after the $ character
comes the group name or subsection name (you can refer to it in both ways). The subsection name is generated
dynamically by the compiler, but when writing a TLS callback, you must define it yourself. Choosing a name for the
subsection can be problematic since Microsoft doesn't provide a list of names or a detailed explanation of the CRT's
inner workings. However, based on research presented in the previous chapter, I stuck with the XL subsection.
I would also advise avoiding subsections ending with A or Z (e.g., XLA), as they point to the beginning and end
of the subsection, and may have special significance.
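Putting the pieces together, a dependency-free sketch of the C-like approach could look like the following. It assumes a local TlsCallback alias in place of PIMAGE_TLS_CALLBACK from the windows crate, and it swaps the message box for an atomic counter (CALLS, a hypothetical name) so the block compiles without external crates. The registration is gated on Windows because the .CRT$XLB subsection only has meaning in PE files.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicU32, Ordering};

// Same shape as PIMAGE_TLS_CALLBACK in the windows crate: an Option around
// an `extern "system"` function pointer (assumed local alias).
type TlsCallback = Option<unsafe extern "system" fn(*mut c_void, u32, *mut c_void)>;

// Counts invocations. Atomics need no runtime setup, so they are usable
// even though the callback runs before main.
static CALLS: AtomicU32 = AtomicU32::new(0);

unsafe extern "system" fn tls_callback_1(
    _dll_handle: *mut c_void,
    _reason: u32,
    _reserved: *mut c_void,
) {
    CALLS.fetch_add(1, Ordering::SeqCst);
}

// Registration: place the pointer in the .CRT$XLB subsection.
// Note: the Rust 2024 edition spells this #[unsafe(link_section = "...")].
#[cfg(windows)]
#[used]
#[link_section = ".CRT$XLB"]
static TLS_CALLBACK_1: TlsCallback = Some(tls_callback_1);

fn main() {
    println!(
        "Hello from main! Callback invocations so far: {}",
        CALLS.load(Ordering::SeqCst)
    );
}
```

On non-Windows targets the static is compiled out, so the counter simply stays at zero.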
And basically, that's all. After running the program, you should first see a message box, and then, when you dismiss it, "Hello from main!" will be printed to the console.
As you can see, the callbacks were executed twice: once before and once after the program printed the "Hello from main" message. This occurs because callbacks are triggered both when the thread is created and when it is terminated.
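The Reason argument tells the callback which of these events triggered it. The sketch below lists the four values defined in winnt.h and shows one way to react only to the invocation that precedes main. It assumes that the first invocation for the initial thread arrives as DLL_PROCESS_ATTACH and the last one as DLL_PROCESS_DETACH, which is how the PE format documentation describes it; reason_name and is_pre_main are hypothetical helpers.

```rust
// Values of the Reason argument, as defined in winnt.h.
const DLL_PROCESS_DETACH: u32 = 0;
const DLL_PROCESS_ATTACH: u32 = 1;
const DLL_THREAD_ATTACH: u32 = 2;
const DLL_THREAD_DETACH: u32 = 3;

/// Hypothetical helper: human-readable name for a reason code.
fn reason_name(reason: u32) -> &'static str {
    match reason {
        DLL_PROCESS_ATTACH => "DLL_PROCESS_ATTACH",
        DLL_THREAD_ATTACH => "DLL_THREAD_ATTACH",
        DLL_THREAD_DETACH => "DLL_THREAD_DETACH",
        DLL_PROCESS_DETACH => "DLL_PROCESS_DETACH",
        _ => "unknown",
    }
}

/// Hypothetical helper: true only for the invocation that happens before
/// the entry point runs (assumption: process start is reported as
/// DLL_PROCESS_ATTACH).
fn is_pre_main(reason: u32) -> bool {
    reason == DLL_PROCESS_ATTACH
}

fn main() {
    let reasons = [
        DLL_PROCESS_ATTACH,
        DLL_THREAD_ATTACH,
        DLL_THREAD_DETACH,
        DLL_PROCESS_DETACH,
    ];
    for &reason in &reasons {
        println!("{} -> pre-main: {}", reason_name(reason), is_pre_main(reason));
    }
}
```

Inside a real callback, a check like is_pre_main(reason) at the top would keep the payload from running a second time at exit.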
Ctor way
Before I dive into reversing binaries, I'll show you the second approach. It differs in that I don't split the global variable and function definition. Instead, everything is stored in a single block:
static TLS_CALLBACK_2: extern "C" fn = ;
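A sketch of what the single-block form can look like in full, under a few assumptions: the .CRT$XLC subsection name, the SEEN flag, and the tls_callback_2 name are illustrative choices, not taken from the original listing. It uses extern "system" for consistency with the C-like variant; on x86_64 Windows this is the same calling convention as extern "C".

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicBool, Ordering};

// Set by the callback so that main can tell whether it has run.
static SEEN: AtomicBool = AtomicBool::new(false);

// The static and the function live in one item: the block defines the
// function and evaluates to its pointer.
// The subsection name .CRT$XLC is an assumption made for this sketch.
// Note: the Rust 2024 edition spells this unsafe(link_section = "...").
#[used]
#[cfg_attr(windows, link_section = ".CRT$XLC")]
static TLS_CALLBACK_2: unsafe extern "system" fn(*mut c_void, u32, *mut c_void) = {
    unsafe extern "system" fn tls_callback_2(
        _dll_handle: *mut c_void,
        _reason: u32,
        _reserved: *mut c_void,
    ) {
        SEEN.store(true, Ordering::SeqCst);
    }
    tls_callback_2
};

fn main() {
    // On Windows the loader is expected to have invoked the callback by now.
    println!("callback seen before main: {}", SEEN.load(Ordering::SeqCst));
}
```

The function is not nameable outside the block, which keeps the callback from being called by accident elsewhere in the program.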
At the end of the day, both methods produce the same results when it comes to callbacks. So, without further ado, let's move on to the next part.
Reversing
I've created a Hello World program that involves the use of two callbacks, which utilize the implementations described
earlier. Analyzing the TLS Directory shows that there are three callbacks:
Here are two custom callbacks in Ghidra. As you can see, the addresses 0x140001040 and 0x140001064 match those
presented on the previous screen.
Now, going to the .CRT$XL subsection reveals that the callbacks I defined are also present here:
There aren't any additional references to custom callbacks in the CRT subsection. However, when you look again at the
previous screenshot, you'll see that both callbacks are referenced by Entry Point(*), which is an external reference,
meaning the functions are called by other programs - most likely by the Windows Loader.
Additionally, as you may have observed earlier, these callbacks are invoked twice - once before and once after main.
This reflects the initialization/destruction behavior of the callbacks. Furthermore, callbacks appear to be executed in
alphabetical order based on the CRT subsection names, which Microsoft has confirmed, as I will demonstrate in the next section.
Earlier, I mentioned that I stuck with the .CRT$XL subsection. But what would happen if you chose a different
subsection, like .CRT$AAA or something else? Well, if you're lucky, you might hit a "free" subsection, and your
code will still work. However, your functions may disappear from the callback array (though they might still be invoked).
You may wonder how this is even possible. Well, here's the next part to explain that.
CRT and pre-main code
When trying to search for information about CRT subsections, you may notice that rust-ctor uses .CRT$XCU. You might even
come across this article, where it's stated that the Microsoft C++ compiler uses the XCU subsection for global initializers.
Furthermore, the same article says the following:
The names .CRT$XCT and .CRT$XCV aren't used by either the compiler or the CRT library right now, but there's no guarantee that they'll remain unused in the future. And, your variables could still be optimized away by the compiler. Consider the potential engineering, maintenance, and portability issues before adopting this technique.
Aside from that, there is nothing more I could find about subsection names in the publicly available documentation.
Based on the available documentation, we know that Microsoft uses XCU, and other names may or may not be reserved.
Before you start "spraying and praying" when choosing a subsection name, let me present my own spraying results.
I observed some tendencies between subsection names and function behavior:
- Functions defined in groups from XLB to XLZ were visible in the callbacks array and were executed twice. These are actual TLS callbacks.
- Functions defined in groups from XCB to XCZ weren't visible in the callbacks array but were executed only once.
- Functions defined in the XIZ group were executed but caused a STATUS_ACCESS_VIOLATION, causing the program to crash.
As you can see, picking random names isn't very reliable, and results may vary depending on your OS version or the
toolchain used to build the executable. The problem is that we don’t know exactly what each subsection stores, so
considering that Microsoft uses XCU and that the authors of the rust-ctor crate also use that subsection, we could
simply stick with it, as well as XLZ, and accept those results. However, I wasn't satisfied and wanted more answers
to my questions.
At this point, I compiled the program again, this time with two callbacks, but both defined in the XCU subsection.
You can store multiple pointers in one subsection, as the following listing from Microsoft's documentation suggests:
.CRT$XCA
__xc_a
.CRT$XCU
Pointer to Global Initializer 1
Pointer to Global Initializer 2
.CRT$XCZ
__xc_z
Keep in mind that from now on, every screenshot I show will reference a program with "callbacks" (though, in this case,
they aren't actually callbacks, but this will be explained later) stored in the .CRT$XCU subsection.
As I dug deeper into this topic, I came across this article, which led me to analyze the CRT source code. In this source code, there is a small list that explains what each particular subsection stores:
extern _PIFV __xi_a;
extern _PIFV __xi_z; /* C initializers */
extern _PVFV __xc_a;
extern _PVFV __xc_z; /* C++ initializers */
extern _PVFV __xp_a;
extern _PVFV __xp_z; /* C pre-terminators */
extern _PVFV __xt_a;
extern _PVFV __xt_z; /* C terminators */
As you can see, Microsoft didn’t lie in their documentation when they said that XCU stores global initializers
placed there by the C++ compiler. However, this is based on an older source, and Microsoft now uses something called
UCRT (Universal C Runtime). I found the UCRT source code on my disk, located at
X:\Windows Kits\10\Source\10.0.22621.0\ucrt (for Visual Studio 2022 installation). When searching for .CRT
occurrences, there wasn’t a general list like in the older version. Instead, I got more specific insights into what is
stored in particular subsections.
I decided to search for the __xc_a identifier, which points to the beginning of the XC subsection, and this led me to
the _initterm function.
Next, I opened Ghidra and found references to the initterm function, which brought me to the __scrt_common_main_seh
where initterm is invoked:
Now, jumping to the __xc_a symbol, here they are - pointers to the defined callback functions.
Now, regarding the initterm function code, it is as follows:
// Calls each function in [first, last). [first, last) must be a valid range of
// function pointers. Each function is called, in order.
extern "C" void __cdecl _initterm(_PVFV* const first, _PVFV* const last);
It becomes obvious that initterm treats the range between __xc_a and __xc_z, which includes the XCU subsection, as an
array and invokes each function stored in that array. Since initterm, as part of the CRT, is invoked before the main
function, it provides the possibility to execute user-defined code before main.
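The traversal can be modeled in a few lines of Rust. This is a simulation of the behavior described in the comment above, not the CRT code itself: Pvfv, initializer_1, and initializer_2 are hypothetical names, and the table stands in for the merged XC range. It also assumes that the __xc_a and __xc_z markers are null entries, which would explain why null pointers have to be skipped.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for _PVFV: a nullable pointer to a function that takes no
/// arguments and returns nothing.
type Pvfv = Option<extern "C" fn()>;

// Records which initializers ran (1 for the first, 10 for the second).
static CALLS: AtomicUsize = AtomicUsize::new(0);

extern "C" fn initializer_1() {
    CALLS.fetch_add(1, Ordering::SeqCst);
}

extern "C" fn initializer_2() {
    CALLS.fetch_add(10, Ordering::SeqCst);
}

/// Calls every non-null entry of the table, in order.
fn initterm(table: &[Pvfv]) {
    for &entry in table {
        if let Some(function) = entry {
            function();
        }
    }
}

fn main() {
    // Modeled range: __xc_a (null), two XCU entries, __xc_z (null).
    let table: [Pvfv; 4] = [None, Some(initializer_1), Some(initializer_2), None];
    initterm(&table);
    println!("recorded calls: {}", CALLS.load(Ordering::SeqCst));
}
```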
As I mentioned earlier, when a function is stored in the XCU subsection, it’s not visible as a callback function in
the TLS Directory:
This is because, technically, those functions aren’t callbacks, and they are stored in a different part of the file.
If you want your function to be an actual callback, its address must be present in the array pointed to by the
AddressOfCallbacks field, which is stored in the TLS Directory. To do so, you must define your callback inside the
XL subsection, as explained in the TLS callback chapter.
When you define your functions in the .CRT$XCU subsection, they shouldn’t take any arguments because initterm
doesn’t pass arguments to them.
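#[used] #[link_section = ".CRT$XCU"]
static INIT_FUNCTION_1: extern "C" fn() = init_function_1;
extern "C" fn init_function_1() { /* ... */ }
#[used] #[link_section = ".CRT$XCU"]
static INIT_FUNCTION_2: extern "C" fn() = { extern "C" fn init_function_2() { /* ... */ } init_function_2 };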
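To check from main that such an initializer really ran, one option is to have it set a flag. The following sketch assumes a hypothetical INIT_RAN flag and gates the registration on Windows, since .CRT$XCU only has meaning in PE files built against the Microsoft CRT.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Set by the initializer, read by main.
static INIT_RAN: AtomicBool = AtomicBool::new(false);

// No parameters and no return value, matching what initterm expects.
extern "C" fn init_function_1() {
    INIT_RAN.store(true, Ordering::SeqCst);
}

// Registration in the XCU subsection (Windows only).
// Note: the Rust 2024 edition spells this #[unsafe(link_section = "...")].
#[cfg(windows)]
#[used]
#[link_section = ".CRT$XCU"]
static INIT_FUNCTION_1: extern "C" fn() = init_function_1;

fn main() {
    // Expected to report true on Windows; on other targets nothing
    // registers the function, so the flag stays false.
    println!("initializer ran before main: {}", INIT_RAN.load(Ordering::SeqCst));
}
```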
Defining functions in different areas of the XC subsection, like XCB or XCW, will also involve the initterm
function. However, depending on the last letter, custom functions may be called either before or after the
initialization of C++ globals (XCU subsection).
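The effect of the last letter can be illustrated with a small model that orders contributions by subsection name, the way the merged XC range ends up laid out. The merge_order function and the labels are hypothetical; the model only captures the alphabetical ordering between differently named subsections and says nothing about the order of entries that share one name.

```rust
/// Hypothetical model: sorts (subsection, label) pairs by subsection name
/// and returns the labels in the resulting order.
fn merge_order(mut items: Vec<(&'static str, &'static str)>) -> Vec<&'static str> {
    items.sort_by_key(|&(section, _)| section);
    items.into_iter().map(|(_, label)| label).collect()
}

fn main() {
    let order = merge_order(vec![
        (".CRT$XCW", "custom function (late)"),
        (".CRT$XCU", "C++ global initializer"),
        (".CRT$XCZ", "__xc_z"),
        (".CRT$XCB", "custom function (early)"),
        (".CRT$XCA", "__xc_a"),
    ]);
    for label in &order {
        println!("{}", label);
    }
}
```

In this model the XCB entry lands before the C++ global initializers and the XCW entry after them, which matches the behavior described above.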
Similarly to initterm, there is the initterm_e function, which operates on the XI subsection.
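The main difference, as far as I can tell from the UCRT sources, is that functions in the XI range return an int and initterm_e stops at the first non-zero result, which the startup code treats as a failure. A Rust model of that behavior, with hypothetical names (Pifv, ok_initializer, failing_initializer), might look like this:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for _PIFV: a nullable pointer to a function returning an int.
type Pifv = Option<extern "C" fn() -> i32>;

// Counts how many times the succeeding initializer was called.
static OK_CALLS: AtomicUsize = AtomicUsize::new(0);

extern "C" fn ok_initializer() -> i32 {
    OK_CALLS.fetch_add(1, Ordering::SeqCst);
    0
}

extern "C" fn failing_initializer() -> i32 {
    7 // arbitrary non-zero error code for the sketch
}

/// Calls each non-null entry in order and returns the first non-zero
/// result, or 0 if every initializer succeeded.
fn initterm_e(table: &[Pifv]) -> i32 {
    for &entry in table {
        if let Some(function) = entry {
            let result = function();
            if result != 0 {
                return result;
            }
        }
    }
    0
}

fn main() {
    let table: [Pifv; 4] = [
        None,
        Some(ok_initializer),
        Some(failing_initializer),
        Some(ok_initializer), // not reached: the previous entry failed
    ];
    println!("initterm_e returned {}", initterm_e(&table));
}
```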
Summary
To summarize, in order to execute code before main, you can use either TLS Callbacks or the initterm function.
- TLS Callbacks are defined within the .CRT$XL subsection and must be stored as PIMAGE_TLS_CALLBACK function pointers. These callbacks are called twice: once when the thread starts and once when it ends.
- Functions executed by initterm are stored in the .CRT$XC subsection and are processed as pointers to functions that do not return any value and do not take any arguments. You can define multiple functions within one subsection.
- Different CRT subsections are called in alphabetical order.