.NET CLR process internals
The Windows operating system architecture consists of several layers and components, each developed over time to meet specific commercial and technological requirements. These components work together to provide a robust and flexible platform for applications and services.
One significant example of this evolution is the introduction of the Distributed Component Object Model (DCOM). DCOM was created to extend the capabilities of the original Component Object Model (COM) by enabling communication between COM components across networked computers. By leveraging the Microsoft Remote Procedure Call (MSRPC) protocol, DCOM allows programs to remotely invoke methods on COM objects residing on different machines within a network. This means that applications can utilize the functionality of remote components without having the binaries installed locally, facilitating distributed computing and resource sharing.
This architecture is crucial for enterprise environments where applications are often distributed across multiple servers for scalability, load balancing, and redundancy. DCOM's ability to provide location transparency and remote interaction simplifies the development of networked applications and services.
Over time, Windows has continued to evolve its architecture to meet new demands. Technologies such as the .NET Framework and Windows Communication Foundation (WCF) were introduced to provide improved interoperability, security, and ease of development for networked applications. The Universal Windows Platform (UWP) further extends this by allowing developers to create applications that run across a range of Windows devices with a single codebase.
To recap, the Windows subsystem APIs can be listed as follows:
Windows API (“Win32”): classic C API from the first days of Windows NT
COM-based APIs: used especially in newer (Vista+) APIs; examples include BITS, DirectX, WIC, DirectShow, Media Foundation, Task Host
.NET: managed libraries running on top of the CLR
Windows Runtime (WinRT): new unmanaged API available for Windows 8+, built on top of an enhanced version of COM
The Native API: implemented by NtDll.dll
This article does not fully cover the Ahead-of-Time (AOT) compilation features that some .NET binaries ship with, especially those compiled with NGEN, nor self-contained .NET apps that do not use clrjit.dll.
I will focus on the .NET managed libraries and the processes that run on top of the Common Language Runtime (CLR). The CLR is the execution engine for .NET applications, providing a managed environment for code execution. Every .NET binary, known as an assembly, is loaded and executed within this environment. The CLR is responsible for critical functionalities such as threading, memory management, security enforcement, garbage collection, and Just-in-Time (JIT) compilation. As a result, .NET code does not manage threading at a low level directly; instead, threading is abstracted and handled by the runtime.
When a .NET application starts, the process first executes a small stub responsible for loading the CLR into the process's address space. This stub acts as a bridge between the operating system and the managed code of the CLR. Once the CLR is loaded, it begins executing the managed code of the application.
When a .NET method is invoked for the first time, the CLR's JIT compiler compiles the Intermediate Language (IL) code of that method into native machine code suitable for the processor architecture (such as x86 or ARM). The compiled native code is then kept in memory for subsequent calls to the method, which avoids recompilation. Importantly, not all IL code is compiled at once; methods are compiled on demand, the first time they are called. This approach is known as "lazy JIT compilation." Until a method has been compiled, its entry point refers to a small stub that triggers the JIT compiler. Once the native code has been generated and placed in memory, the CLR patches that entry point so that future calls go directly to the compiled native code, bypassing the stub.
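The effect of on-demand compilation can be observed from managed code with a rough timing experiment. The following is a minimal sketch, not a benchmark: the class and method names (LazyJitDemo, SumTo, TimeOneCall) are made up for illustration, and it assumes the binary is JIT-compiled rather than precompiled with NGEN or a similar tool. Tick counts vary between machines and runs, so no specific output is claimed.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class LazyJitDemo
{
    // NoInlining keeps SumTo a separate method, so it is compiled on its
    // own first call rather than being folded into its caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int SumTo(int n)
    {
        int total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
        }
        return total;
    }

    // Times a single call to SumTo and returns the elapsed Stopwatch ticks.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static long TimeOneCall()
    {
        Stopwatch sw = Stopwatch.StartNew();
        SumTo(100);
        sw.Stop();
        return sw.ElapsedTicks;
    }

    public static void Main()
    {
        // Warm up Stopwatch so its own first-use cost is not measured.
        Stopwatch.StartNew().Stop();

        long first = TimeOneCall();   // SumTo is JIT-compiled during this call
        long second = TimeOneCall();  // SumTo's native code already exists

        Console.WriteLine("first call:  " + first + " ticks");
        Console.WriteLine("second call: " + second + " ticks");
    }
}
```

On a JIT-compiled build, the first measurement would typically be noticeably larger than the second, because it includes the time the JIT compiler spends translating SumTo from IL to native code.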
Regarding the application's entry point, for .NET Framework versions earlier than 3.5, the actual entry point of a .NET executable is the unmanaged function MSCOREE!ShellShim__CorExeMain. Starting from .NET Framework 4.0, the entry point is mscoree!__CorExeMain, which then calls the clr!__CorExeMain stub. This stub primarily initializes the process's Structured Exception Handling (SEH) and then calls CorExeMainInternal.
CorExeMainInternal performs several critical initializations:
Garbage Collector Initialization: Sets up the garbage collector, which manages memory allocation and reclamation.
Exception Handling Setup: Initializes CLR exception objects to handle exceptions in managed code.
Event Tracing for Windows (ETW): Sets up ETW tracing routines for profiling and debugging.
Finally, it calls the ExecuteEXE method, which uses the static method SystemDomain::ExecuteMainMethod to start executing the application's Main method.
The CLR loads assemblies on demand. When the JIT compiler encounters a reference to a type that resides in an assembly not yet loaded, the runtime loads that assembly at that time. This on-demand assembly loading can reduce the application's startup time by deferring the loading of assemblies until they are actually needed.
To demonstrate this, I'll create a class library named ClassLibrary1.dll and add a reference to it from the target binary, which I'll name ConsoleApp2.exe.
We can use System.Reflection to enumerate the assemblies loaded in the default AppDomain, or in whichever domain the code is currently executing.
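A minimal sketch of such an enumeration is shown here. The class name LoadedAssemblies and its Names helper are placeholders chosen for this example and are not part of the original ConsoleApp2 project; AppDomain.CurrentDomain.GetAssemblies() is the standard call that returns the assemblies loaded so far.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class LoadedAssemblies
{
    // Returns the simple names of all assemblies currently loaded
    // in the AppDomain this code is running in.
    public static string[] Names()
    {
        Assembly[] loaded = AppDomain.CurrentDomain.GetAssemblies();
        return loaded.Select(a => a.GetName().Name).ToArray();
    }

    public static void Main()
    {
        foreach (string name in Names())
        {
            Console.WriteLine(name);
        }
    }
}
```

The exact list depends on the runtime version and on what has already executed, so the output is best read as a snapshot of the domain at the moment of the call.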
As expected, even though the library is referenced, the .NET CLR does not load the assembly into the AppDomain right away, because no method has been invoked and no type has been referenced yet.
ClassLibrary1 is loaded into the AppDomain as soon as an object of one of its classes is instantiated in the Main function of Program (the entry point).
It becomes clear that the CLR loads the assembly only when a method being jitted references the library, whether through a method invocation, an instance creation, or a reference to the type such as typeof(Class).
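The same behaviour can be reproduced without a custom library. The sketch below uses XmlDocument from the framework's XML assembly as a stand-in for a class from ClassLibrary1; this substitution, and the names OnDemandLoadDemo, LoadedNames, and UseXmlType, are assumptions made to keep the example self-contained. The XML type is confined to a non-inlined method so that compiling Main does not itself reference it.

```csharp
using System;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Xml;

public static class OnDemandLoadDemo
{
    // Snapshot of the simple names of the assemblies loaded right now.
    public static string[] LoadedNames()
    {
        return AppDomain.CurrentDomain.GetAssemblies()
            .Select(a => a.GetName().Name)
            .ToArray();
    }

    // The only method that touches a type from the XML assembly.
    // NoInlining keeps the type reference out of Main's compiled body.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string UseXmlType()
    {
        XmlDocument doc = new XmlDocument();
        return doc.GetType().Assembly.GetName().Name;
    }

    public static void Main()
    {
        string[] before = LoadedNames();
        string xmlAssembly = UseXmlType();   // first use: JIT of this method needs the assembly
        string[] after = LoadedNames();

        Console.WriteLine("XmlDocument lives in: " + xmlAssembly);
        Console.WriteLine("Loaded before first use: " + (Array.IndexOf(before, xmlAssembly) >= 0));
        Console.WriteLine("Loaded after first use:  " + (Array.IndexOf(after, xmlAssembly) >= 0));
    }
}
```

In a plain console application one would expect the assembly to be absent from the first snapshot and present in the second, although a host that has already used XML types for its own purposes would show it as loaded in both.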
The internals of the .NET runtime are documented inside: