Saturday, January 30, 2010

Microsoft Intermediate Language and the JITters

To make it easy for language writers to port their languages to .NET, Microsoft developed a language akin to assembly language called Microsoft intermediate language (MSIL). To compile applications for .NET, compilers take source code as input and produce MSIL as output. MSIL itself is a complete language that you can write applications in. However, as with assembly language, you would probably never do so except in unusual circumstances. Because MSIL is its own language, each compiler team makes its own decision about how much of the MSIL it will support. However, if you're a compiler writer and you want to create a language that does interoperate with other languages, you should restrict yourself to features specified by the CLS.
When you compile a C# application or any application written in a CLS-compliant language, the application is compiled into MSIL. This MSIL is then further compiled into native CPU instructions when the application is executed for the first time by the CLR. (Actually, only the called functions are compiled the first time they are invoked.) However, since we're all geeks here and this book is called Inside C#, let's look at what's really happening under the hood: -
  1. You write source code in C#.
  2. You then compile it using the C# compiler (csc.exe) into an EXE.
  3. The C# compiler outputs the MSIL code and a manifest into a read-only part of the EXE that has a standard PE (Win32-portable executable) header.
  4. So far, so good. However, here's the important part: when the compiler creates the output, it also imports a function named _ CorExeMain from the .NET runtime.
  5. When the application is executed, the operating system loads the PE, as well as any dependent dynamic-link libraries (DLLs), such as the one that exports the _ CorExeMain function (mscoree.dll), just as it does with any valid PE.
  6. The operating system loader then jumps to the entry point inside the PE, which is put there by the C# compiler. Once again, this is exactly how any other PE is executed in Windows.
  7. However, since the operating system obviously can't execute the MSIL code, the entry point is just a small stub that jumps to the _ CorExeMain function in mscoree.dll.
  8. The _ CorExeMain function starts the execution of the MSIL code that was placed in the PE.
  9. Since MSIL code cannot be executed directly-because it's not in a machine-executable format-the CLR compiles the MSIL by using a just-in-time (JIT) compiler (or JITter) into native CPU instructions as it processes the MSIL. JIT compiling occurs only as methods in the program are called. The compiled executable code is cached on the machine and is recompiled only if there's some change to the source code.
Three different JITters can be used to convert the MSIL into native code, depending on the circumstances: -
  • Install-time code generation Install-time code generation will compile an entire assembly into CPU-specific binary code, just as a C++ compiler does. An assembly is the code package that's sent to the compiler. (I'll talk about assemblies in more detail later in this chapter in "Deployment.") This compilation is done at install time, when the end user is least likely to notice that the assembly is being JIT-compiled. The advantage of install-time code generation is that it allows you to compile the entire assembly just once before you run it. Because the entire assembly is compiled, you don't have to worry about intermittent performance issues every time a method in your code is executed the first time. It's like a time-share vacation plan in which you pay for everything up front. While paying for the vacation plan is painful, the advantage is that you never have to worry about paying for accommodations again. When and if you use this utility depends on the size of your specific system and your deployment environment. Typically, if you're going to create an installation application for your system, you should go ahead and use this JITter so that the user has a fully optimized version of the system "out of the box."
  • JIT The default JITter is called at run time-in the manner I described in the preceding numbered list-each time a method is invoked for the first time. This is akin to a "pay-as-you-go" plan and is the default if you don't explicitly run the PreJIT compiler.
  • EconoJIT Another run-time JITter, the EconoJIT is specifically designed for systems that have limited resources-for example, handheld devices with small amounts of memory. The major difference between this JITter and the regular JITter is the incorporation of something called code pitching. Code pitching allows the EconoJIT to discard the generated, or compiled, code if the system begins to run out of memory. The benefit is that the memory is reclaimed. However, the disadvantage is that if the code being pitched is invoked again, it must be compiled again as though it had never been called.

No comments:

Post a Comment