The Common Language Runtime

COM led to a lot of great software and some very useful systems over the years. However, during its lifespan, the problems we just discussed emerged time and again (and we haven’t even mentioned the problems with DCOM). The initial impetus behind COM was the question “How can we compose already-compiled binary software that was developed using different languages and tools?” The answer was “Build reliable, predictable bridges between separate components.” COM concentrated on a well-established boundary—the one between the client and the object. COM makes sure the boundary is well-defined and named so there’s no ambiguity between the client and the object.

It eventually became evident that the boundary between the client and the object didn’t necessarily have to exist at all. What if a runtime environment were available that dissolved the boundary between the client and the object? That’s what the common language runtime is all about—erasing the boundaries between components. The common language runtime basically reframes the entire component development problem.

No Boundaries

Recall that one of the biggest problems with COM is that different development environments work with different data types. The disparity in data types solidifies the boundary between the client and the DLL.

As mentioned, the main idea behind the common language runtime is to erase the boundaries between components. It does this in two ways: by providing a common runtime environment for components to live in, and by establishing a common type system (CTS). Any components targeted to live within the common language runtime must base themselves on the CTS. (We’ll discuss CTS in detail later in this chapter.)

Figure 31-1 shows how COM components bridge DLL boundaries. Figure 31-2 shows how all common language runtime objects live within the same runtime and don’t have to make boundary crossings.

Figure 31-1. COM boundary crossings.

Figure 31-2. Common language runtime components don’t have to worry about boundary crossings.

COM provided a modest amount of type information with its components, but the type information was sometimes incomplete due to the disparities between programming environments. The common language runtime and .NET development tools fix this. With the common language runtime and its pervasive type system, components reflect themselves accurately. You can know anything you want to about the code at run time and development time.

As you’ll see later, the common language runtime provides garbage collection, memory layout management, and security control. To perform these services effectively, it has to know everything about the code that it’s hosting. In fact, types living within the runtime are called managed types because all aspects of their creation and execution are managed by the runtime.

Mscoree.dll includes the basic functionality for the common language runtime. Another DLL, Mscorlib.dll, comprises the runtime library. Mscoree.dll is an unmanaged DLL that provides loading functionality and runtime services. Mscorlib.dll is a managed DLL that contains the core types used throughout the system. Your own managed executables use both Mscoree.dll and Mscorlib.dll.

It’s All About Type

In normal C development, type consciousness was optional—everything was basically some form of an integer that you could cast any way you wanted to. C++ raised the bar of type-consciousness. However, you could easily defeat the C++ type system using a cast.

In .NET, types are king. Everything is a well-defined type—from the lowliest integer to the most complex class. All types within the common language runtime derive from a fundamental system type named System.Object. This is a bit different than in classical programming environments such as C++, whose types (primitives such as int, long, and char) mostly denote memory usage. Common language runtime types have built-in reflection and the facilities of System.Object at their disposal. System.Object is analogous to the VARIANT commonly found in COM’s scripting interfaces, because a variant includes the data type (not just the data). You can always interrogate a VARIANT to find out what kind of data it represents. System.Object is also similar to MFC’s CObject class because System.Object provides some fundamental services that are useful to both the runtime and developers. Some of the more useful System.Object functions include Equals, GetType, ToString, Finalize, and MemberwiseClone.

Equals determines whether two instances of a type are equal. GetType returns the type of an instance at run time, much as the CObject::GetRuntimeClass method does in MFC. ToString returns a string representation of the instance; by default, this is the name of the instance’s type. Finalize gives the object a chance to free resources and carry out other cleanup operations before the garbage collector reclaims it. MemberwiseClone resembles the compiler-generated copy constructor in C++: it performs a shallow, field-by-field copy of an instance of a common language runtime type, so reference fields in the copy refer to the same objects as the original.

Classic C++ supports composing your own types using the typedef statement or by defining structures and classes. When you define a type within the context of C++, you’re telling the compiler about the structure of the type.

The common language runtime also supports composing your own types. But because custom types also derive from System.Object, these custom types automatically include type information and the other services provided by System.Object.

In summary, one of the most important goals of the common language runtime is to support cross-language programming. To accomplish this, it extends the notion of type much further than C++ or COM do. For example, types within a C++ program are restricted to that language, and types within a Visual Basic program are restricted to the Visual Basic runtime. Types within the common language runtime must adhere to the rules of the common type system.

Common Language Runtime Types

As a C++ developer, you’re probably accustomed to using C++ types denoted by such keywords as long, float, and class. However, if you met a Visual Basic 6.0 developer on the street and began talking about C++ types, he would have a very different notion of what you were talking about. Different development environments define their data types differently. The .NET approach is to define types within the context of a common runtime environment.

In the mid-1990s, each software development environment had its own runtime support. For example, Visual Basic 6.0 has its own runtime engine, Vbrun.dll. The data types within Visual Basic 6.0 are managed by Vbrun.dll. MFC has its own runtime support DLL as well: MFCxxx.dll (with xxx denoting whatever the current version is). The same goes for ATL, which has its own support DLL. Rather than depend on a specific language or on runtime support from a specialized library, .NET code relies on a single type system, a common runtime engine, and a common class library. Component integration is much easier because all the components of an application work with the same data types. Interop issues between .NET components are virtually nonexistent.

The basis for the common language runtime is the fact that data types are the same for every component running under the runtime. To enforce type compatibility between components, types targeted for the common language runtime must adhere to the CTS at run time. The CTS defines rules for various language implementations to follow.

The CTS defines several types, including value types, reference types, enumerations, arrays, delegates, interfaces, and classes. It also defines a pointer type for interoperating with unmanaged code (code not running within the common language runtime). Following is a rundown of each of these types.

Value Types

Value types represent flat values: data that occupies its own block of memory, as opposed to reference types, which “point” to data stored elsewhere. When value types are passed to functions as parameters, they are literally copied from the caller’s context to the callee’s context. The .NET common language runtime supports two kinds of value types: built-in value types and user-defined value types. Built-in value types include types such as System.Int32 and System.Boolean. User-defined value types, such as structures, are composed from other types. A good example of a user-defined value type is a collection of coordinates that define a shape. Because value types simply define memory layout, they do not carry the overhead associated with classes, and the runtime handles them very efficiently.

Reference Types

Whereas a variable of a value type contains a value of that type, a variable of a reference type is more akin to a C++ pointer: it contains a reference to an object of that type. Reference types are managed by the runtime and live on the garbage-collected heap.
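The difference between copying a value and sharing a reference can be sketched in standard C++. This is only an analogy, not managed code: Point, Shape, and the helper functions are hypothetical names, and std::shared_ptr merely stands in for a runtime-managed reference.

```cpp
#include <cassert>
#include <memory>

// Hypothetical types for illustration only.
struct Point {            // plays the role of a value type
    int x;
    int y;
};

struct Shape {            // plays the role of a reference type
    Point origin;
};

// The callee receives its own copy; the caller's Point is untouched.
void MoveByValue(Point p) { p.x += 10; }

// The callee works on the same heap object that the caller refers to.
void MoveByReference(const std::shared_ptr<Shape>& s) { s->origin.x += 10; }

// Returns the caller's x after handing a copy to the callee.
int XAfterValueCall() {
    Point p{1, 2};
    MoveByValue(p);
    return p.x;
}

// Returns the caller's x after handing a reference to the callee.
int XAfterReferenceCall() {
    std::shared_ptr<Shape> s = std::make_shared<Shape>(Shape{Point{1, 2}});
    MoveByReference(s);
    return s->origin.x;
}
```

XAfterValueCall leaves the caller’s data unchanged, while XAfterReferenceCall sees the callee’s modification because both sides refer to one object.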

Boxing and Unboxing

Because of how value types and reference types differ, you sometimes need to convert value types to reference types. This process is known as boxing. Let’s say you run across a function call that takes a reference type in the parameter list, and as the caller you hold only a value. In managed C++, passing a value type where a reference type is required is an error that the compiler reports; the value is not converted for you. Instead, you box the value, which clones it into a new object on the managed heap and gives you a reference to that object. Copying the value out of a boxed object and back into a value type instance is known as unboxing. Managed C++ includes language support for boxing and unboxing types, as you’ll see in Chapter 32.
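To make the clone-and-reference idea concrete, here is a minimal sketch in standard C++. It is an analogy under stated assumptions, not the runtime’s mechanism: Boxed, Box, and Unbox are hypothetical names, and an ordinary heap object held by std::shared_ptr stands in for an object on the managed heap.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a boxed value: a heap object that owns a copy.
template <typename T>
struct Boxed {
    T value;
};

// "Boxing": clone the value into a new heap object and hand back a reference.
template <typename T>
std::shared_ptr<Boxed<T>> Box(const T& v) {
    return std::make_shared<Boxed<T>>(Boxed<T>{v});
}

// "Unboxing": copy the value out of the heap object into a plain value.
template <typename T>
T Unbox(const std::shared_ptr<Boxed<T>>& b) {
    return b->value;
}

// Shows that the box holds a clone, not the original variable.
int BoxedValueAfterOriginalChanges() {
    int n = 42;
    std::shared_ptr<Boxed<int>> boxed = Box(n);
    n = 7;                   // changing the original...
    return Unbox(boxed);     // ...does not change the boxed copy
}
```

Because the box owns a copy, later changes to the original value are not visible through the reference.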

Enumerations

As a seasoned C++ developer, you’re probably familiar with the C++ enum keyword, which defines a set of named integral constants as a type in the C++ type system. Enumerations as defined by the CTS are a special form of value type; they inherit from System.Enum. Enumerations are useful for describing collections such as the days of the week (Monday, Tuesday, Wednesday, and so forth) and months of the year (January, February, March, and so on). In classic C-style programming, you’d probably assign the values 1 through 12 to represent the months of the year, like so:

enum Months {
   January = 1,
   February,
   March,
   April,
   May,
   June,
   July,
   August,
   September,
   October,
   November,
   December
};

You can create variables of type Months, but the data type underlying the variable is an integer, so it’s tempting to use the number 2 wherever the month February is required. Using the enumerators instead of raw numbers provides a higher level of type safety and code readability. One problem in C++ is that there’s no way to relate the numbers of the months to their names except by writing some extra code. The strongly typed enumerations available in .NET get rid of this problem. When you declare an instance of a .NET enumeration, you can assign it a value from the enumerators defined in that enumeration. We’ll see an example of enumerations in Chapter 32.

The methods available through .NET enumerations include all the members from System.Object and the methods available from System.Enum. The System.Enum functions include Format, GetNames, GetUnderlyingType, GetValues, IsDefined, Parse, and ToObject.
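For comparison, the extra code that plain C++ needs in order to relate month numbers to month names might look like the following sketch. MonthName is a hypothetical helper, not part of any library, and its table has to be kept in step with the enumeration by hand.

```cpp
#include <cassert>
#include <string>

enum Months {
    January = 1, February, March, April, May, June,
    July, August, September, October, November, December
};

// Hand-maintained lookup from month number to month name.
// Nothing ties this table to the enum, so the two can drift apart.
const char* MonthName(int month) {
    static const char* const kNames[] = {
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"
    };
    if (month < January || month > December) {
        return "(unknown)";   // out-of-range values are not caught by the compiler
    }
    return kNames[month - January];
}
```

With a .NET enumeration, the names travel with the type, so a table like this is unnecessary.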

Arrays

Arrays are homogeneous and can hold only elements of a single type. In the unmanaged world we used to live in, arrays were just blocks of memory. Languages such as C and C++ provided syntax for indexing into arrays. Class libraries such as MFC and the Standard Template Library (STL) provide useful classes for managing arrays without the headaches associated with managing raw pointers. For example, MFC includes a CObArray class that includes methods for adding and deleting objects from the array. Visual Basic 6.0 developers are used to working with arrays, too. However, a Visual Basic array ends up as a SAFEARRAY when it’s described with type information. As a C++ developer, catering to the Visual Basic 6.0 crowd means defining arrays using the COM SAFEARRAY structure (a self-describing, possibly multidimensional array whose elements are often of type VARIANT).

The CTS defines an array type that works the same no matter what environment you’re working in. .NET arrays derive from System.Array. Unlike MFC’s CObArray or an STL vector, a .NET array has a fixed size once it is created; System.Array includes functionality for counting elements, getting and setting elements at specific positions, and copying, sorting, and searching the array. When you need a collection that grows as necessary and supports adding and deleting elements, the class library provides collection classes such as System.Collections.ArrayList. You’ll see an example of a managed array in Chapter 32.

Delegates

Any C++ developer who has worked with Windows for a while has dealt with function pointers. When you define a function pointer type in C++, you describe a function signature (the parameter types, return type, and calling convention) so the compiler knows how to set up the call. In this way, you can have various sections of your code calling back and forth.

Delegates inherit from System.Delegate. Within the context of the CTS, delegates serve a similar purpose. Delegates point to .NET methods so you can execute them indirectly. They’re managed types, so they’re fully type-safe. Delegates are different from C++ function pointers. Many function pointers in C++ require special treatment. For example, normal C++ member functions include a hidden first parameter called the this parameter, which is a pointer to the instance of the class for which it is declared. Static and global functions do not have this hidden pointer. .NET delegates can reference all kinds of methods of classes and objects: static, virtual, and instance methods. You’ll find delegates used mostly within the context of event handling and callbacks within .NET applications. Each instance of a delegate can forward a call to one or more methods with matching signatures. That is, delegates can be used to broadcast. You’ll see an example of a managed C++ delegate in Chapter 32.
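The broadcasting behavior can be approximated in standard C++ to show the idea. The sketch below is an analogy only: IntDelegate, Counter, RememberValue, and BroadcastDemo are hypothetical names, and std::function is used here to paper over the difference between instance methods (which need a this pointer) and free functions (which do not).

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical multicast "delegate" for callbacks that take an int.
class IntDelegate {
public:
    void Add(std::function<void(int)> target) {
        targets_.push_back(std::move(target));
    }
    // Forward the call to every attached target, in the order added.
    void Invoke(int value) const {
        for (const std::function<void(int)>& t : targets_) t(value);
    }
private:
    std::vector<std::function<void(int)>> targets_;
};

struct Counter {
    int total = 0;
    void OnValue(int v) { total += v; }        // instance method: needs a this pointer
};

static int g_lastSeen = 0;
void RememberValue(int v) { g_lastSeen = v; }  // free function: no this pointer

// Attaches one instance method and one free function, then broadcasts twice.
int BroadcastDemo() {
    Counter c;
    IntDelegate d;
    d.Add([&c](int v) { c.OnValue(v); });      // bind the instance and its method
    d.Add(&RememberValue);
    d.Invoke(5);
    d.Invoke(6);
    return c.total + g_lastSeen;               // 11 from the counter plus 6 last seen
}
```

Each Invoke reaches both targets, which is the sense in which a delegate can broadcast a call.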

Interfaces

Until the mid-1990s, few Windows developers paid much attention to the discipline of interface-based programming. As you saw when we looked at COM, one of the most important contributions by COM was that of the interface. Using interface-based programming, you can describe type compatibility between different implementations. For example, you might define a shape interface that includes several methods for describing shapes. You might then implement several different shapes using the shape interface, such as a square, a circle, and a line. Each of these shapes behaves very differently. However, by abstracting the shape behavior behind an interface, client code that deals only with the interface can work with all the shapes. The shape interface denotes type compatibility. .NET fully supports interfaces. .NET interfaces primarily serve to provide type compatibility for objects.
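The shape example can be sketched in standard C++ with an abstract base class standing in for the interface. IShape, Square, Circle, Line, and TotalArea are hypothetical names chosen for this illustration, and the single Area method is an assumption about what a shape interface might contain.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical shape interface: only pure virtual methods, no data.
struct IShape {
    virtual ~IShape() {}
    virtual double Area() const = 0;
};

class Square : public IShape {
public:
    explicit Square(double side) : side_(side) {}
    double Area() const override { return side_ * side_; }
private:
    double side_;
};

class Circle : public IShape {
public:
    explicit Circle(double radius) : radius_(radius) {}
    double Area() const override { return 3.141592653589793 * radius_ * radius_; }
private:
    double radius_;
};

class Line : public IShape {
public:
    double Area() const override { return 0.0; }   // a line encloses no area
};

// Client code sees only IShape, so it works with every implementation.
double TotalArea(const std::vector<std::unique_ptr<IShape>>& shapes) {
    double total = 0.0;
    for (const std::unique_ptr<IShape>& s : shapes) total += s->Area();
    return total;
}
```

TotalArea never names a concrete shape, so adding a new implementation requires no change to the client.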

You’ll see an example of using a managed interface in Chapter 32.

Classes

Classes within .NET are similar to classes you’ve worked with using C++. They have data members and methods. In .NET, data members are called the fields within a class. .NET classes can have both virtual and nonvirtual methods. Virtual methods work the way you’d expect them to—to ensure that the correct version of a function is called within a class hierarchy. .NET classes can also implement interfaces, just like C++ classes can. All code running within the common language runtime must somehow be scoped by a class.

.NET offers a bit more flexibility than C++ does as far as classes are concerned. .NET classes can be sealed at some point, and new classes can no longer be derived from them. Also, whole classes can be labeled as abstract, which means that new classes must be derived from them before they’re used. .NET enforces visibility constraints for both the members within a class and the class itself. .NET class members can be public, private, or protected. These visibility modifiers have exactly the same meaning in .NET as they do in C++. .NET class members can also be marked as being visible either within the assembly in which they live or outside that assembly.

You’ll see examples of .NET classes in the next three chapters.

Pointers

The final type available within .NET is the pointer type. The .NET runtime hides most of the details related to pointers, and you never have to see a regular address when you work in .NET. However, within the realm of managed C++, pointer types are available to you when you need them.

The three kinds of .NET pointers are managed pointers, unmanaged pointers, and unmanaged function pointers. When you work with managed code in the normal way (using C#, Visual Basic .NET, or managed C++), the common language runtime is working with managed pointers. For example, when reference types are passed as parameters or returned from methods, the common language runtime uses managed pointers. Only managed pointers are compliant with the Common Language Specification (CLS).

The common language runtime supports unmanaged pointers specifically to offer backward compatibility (with unmanaged C++). As a C++ developer, you’re used to unmanaged pointers—they’re just addresses in memory.

The most common use for pointers is for reading and writing raw data. When you’re using managed references and pointers, you don’t see the actual memory you’re working with. If you’re in a situation where you want to see raw memory, unmanaged pointers are the way to go.

The Common Language Specification

One of the greatest draws of .NET is the wide variety of syntaxes available for expressing functionality within .NET applications. Official .NET languages coming from Microsoft include managed C++, C#, and Visual Basic .NET. However, other companies are producing .NET-compatible languages. There’s a version of Perl for .NET, and Fujitsu even has a COBOL compiler for .NET!

As you’ve seen, the .NET Framework defines a pervasive type system that permeates all executable code running under the common language runtime. Remember that one of the key goals of .NET is to provide a high degree of interoperability among components—no matter what languages they were written in. The common type system guarantees consistent data typing between components. The CLS guarantees that languages follow the CTS.

The CLS is a set of rules defining the behavior of externally visible items. These rules are necessary for software to interoperate within the common language runtime. Remember that the runtime wants to treat all data and code in the same way. Types that adhere to the CLS are completely interoperable. You can mark types as CLS-compliant using the System.CLSCompliantAttribute.

Assemblies

All right, that’s enough talk about types. The next question is: Where does all this wonderful common language runtime code live? Are there still DLLs in this new world? What do executables look like? DLLs and executables are still around in the .NET Framework. However, now they’re called assemblies and they contain Intermediate Language (IL)—not native code.

We looked at normal executables, normal DLLs, and COM DLLs earlier in this book. When we compiled that code, the compiler turned the source code directly into some native machine code. .NET executables and DLLs work a bit differently. They’re compiled into assemblies. Technically, an assembly is simply a collection of type definitions. Type definitions include all the examples we covered earlier—code encapsulated within classes, enumerations, user-defined types, and so forth. Assemblies can also contain resources, such as bitmaps, JPEG files, and resource files.

Classic Windows development draws a strong distinction between DLLs and EXEs. A .NET assembly can be either a DLL or an EXE. Assemblies are the fundamental unit of deployment and include code that the runtime executes. All .NET code executed by the runtime must live within an assembly. Assemblies have only one entry point: DllMain, WinMain, or Main.

Every type within a .NET application must appear in an assembly somewhere. It is denoted by both the name of the assembly and the name of the type. However, once you get down to working with a type within a development environment like managed C++, the development environment usually takes care of managing the assembly name.

The native .NET types we’ve discussed already (such as System.Object and System.ValueType) are contained within the Mscorlib assembly (Mscorlib.dll). Because assemblies define the type boundary within .NET, a type within the scope of one assembly is not the same as a type loaded in the scope of another assembly, even if it shares the same name.

The assembly is the smallest versionable unit in the common language runtime. Assemblies include type information and a section called the manifest, which describes the version information and dependencies on other assemblies.

Built-in Type Information

Built-in type information was one of the most important contributions that COM made to Windows programming. This is also known as reflection. DLLs or executables that have type information included with them become self-describing, enabling both tools and runtime environments to know and understand the contents of the module. For example, as you fill out a COM function call into Visual C++’s edit window, IntelliSense immediately comes up, showing you the function signature. IntelliSense works because there’s type information included with the component. The MTS and COM+ runtimes use type information to manufacture proxy stubs on the fly.

When you’re programming COM using C++, the way to get type information into the executable or the DLL is to include some IDL with the project. The IDL is compiled into a binary type library, and the type library is attached to the module as a resource. .NET includes the same facility, but the type information is automatically included within the assembly. There’s no more need for an intermediate IDL file—when a .NET compiler compiles your code, it generates type information and adds it to the assembly.

The Manifest

In addition to the built-in type information, every assembly includes a section named the manifest. .NET assembly manifests can include dependencies on other assemblies, versioning information, and information relating to the culture and language for which the assembly was intended. An assembly’s manifest is like a top-level directory for the assembly.

Like type information, manifests are integral to .NET development. The information within the manifest tells the loader which assemblies to load when loading an application, which version of an assembly to load, and so on. Manifests are generated automatically—no intermediate steps are involved in creating a manifest.

By including the dependency information, .NET solves a long-standing issue with COM. With COM, there’s no easy way to figure out DLL dependencies. The Platform SDK includes a tool named Depends.exe that examines the import list of a DLL or EXE file to find out DLL dependencies. However, because COM exposes its functionality through interfaces (rather than standard DLL entry points) and because COM DLL loading information is mostly contained within the registry, there’s no way to easily deduce DLL dependencies. .NET manifests do include the dependencies of assemblies. Because the assembly includes dependency information, the common language runtime loader makes sure all required assemblies are loaded before executing the code within an assembly.
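Once every module lists its own dependencies, working out what to load becomes a simple graph walk. The following sketch illustrates that idea only; it does not describe how the real loader is implemented or when it actually resolves each reference. Manifests, CollectLoadOrder, and LoadOrder are hypothetical names, and a manifest is reduced here to a list of referenced assembly names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical manifest data: assembly name -> names of assemblies it references.
typedef std::map<std::string, std::vector<std::string>> Manifests;

// Depth-first walk that appends each assembly after all of its dependencies.
void CollectLoadOrder(const Manifests& manifests, const std::string& name,
                      std::set<std::string>& visited,
                      std::vector<std::string>& order) {
    if (!visited.insert(name).second) return;   // already handled
    Manifests::const_iterator it = manifests.find(name);
    if (it != manifests.end()) {
        for (const std::string& dep : it->second) {
            CollectLoadOrder(manifests, dep, visited, order);
        }
    }
    order.push_back(name);
}

// Returns the assemblies needed by root, dependencies first.
std::vector<std::string> LoadOrder(const Manifests& manifests,
                                   const std::string& root) {
    std::set<std::string> visited;
    std::vector<std::string> order;
    CollectLoadOrder(manifests, root, visited, order);
    return order;
}
```

No such walk is possible for a COM DLL, because its dependencies are scattered across the registry and discovered only through interface requests at run time.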

Private vs. Public Assemblies

In COM’s heyday, one of the most widely touted features of IUnknown was that it was supposed to enable component versioning. A dynamically evolving software project cannot be hardwired together. There must be some flexibility in the way the components connect. COM forces applications to ask their components for interfaces (rather than assuming the interfaces are there). When a new version of a component is dropped into an application (or perhaps an older version of a component is inadvertently installed), the application gets fair warning of the change. The problem with COM versioning is that despite the tremendous flexibility in how components are connected together, the versioning mechanism still fails from time to time. For example, if you install an old version of a component, clients expecting the new component will be mighty disappointed.

The main reason for this versioning failure is that COM components are visible to every application on the PC—they’re global in nature. That means that replacing a component affects all the applications that depend on the component, in a ripple effect. All COM components are referenced in the registry—and the registry is available to all applications. The common language runtime solves this problem by distinguishing between public and private assemblies.

The .NET component model prefers private assemblies. When you confine functionality to a specific component and make it visible only to the client that needs it, you get rid of the ripple effect when you replace the component. Only clients of that particular assembly are affected. One main goal of .NET is to make deploying an application as easy as picking up the contents of a directory and using a copying mechanism (such as XCOPY or FTP) to move the contents to a new directory or machine. Because COM components rely so heavily on the registry, installing and uninstalling components is a major issue.

.NET component versioning works through an established directory structure. The directory containing the application is referred to as the AppBase directory of that application. The process of finding an assembly is called probing. The runtime performs several steps to locate an assembly. It first looks in the AppBase directory and then in a subdirectory under AppBase with the same name as the assembly, checking within the culture subdirectory if it does not find the assembly immediately. The runtime searches for DLLs first and EXEs second. It stops searching after it finds the first match. .NET provides a good amount of flexibility when probing—you can modify the probe mechanism by modifying the application’s configuration file (an XML file accompanying the application that is used for tweaking your application).
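The probing sequence just described can be sketched as a function that lists candidate paths and another that stops at the first match. This is a simplified illustration, not the runtime’s algorithm: ProbeCandidates and Probe are hypothetical names, the exact position of the culture subdirectories in the search order is an assumption, and configuration file overrides are ignored.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Lists candidate paths for an assembly: DLLs first, then EXEs.
// Pass an empty culture string when no culture applies.
std::vector<std::string> ProbeCandidates(const std::string& appBase,
                                         const std::string& name,
                                         const std::string& culture) {
    std::vector<std::string> dirs;
    dirs.push_back(appBase);                        // AppBase itself
    dirs.push_back(appBase + "\\" + name);          // AppBase\<name>
    if (!culture.empty()) {                         // assumed placement of culture dirs
        dirs.push_back(appBase + "\\" + culture);
        dirs.push_back(appBase + "\\" + culture + "\\" + name);
    }
    const char* const extensions[] = {".dll", ".exe"};
    std::vector<std::string> candidates;
    for (const char* ext : extensions) {
        for (const std::string& dir : dirs) {
            candidates.push_back(dir + "\\" + name + ext);
        }
    }
    return candidates;
}

// Returns the first candidate for which exists() is true, or "" if none match.
std::string Probe(const std::vector<std::string>& candidates,
                  const std::function<bool(const std::string&)>& exists) {
    for (const std::string& path : candidates) {
        if (exists(path)) return path;              // stop at the first match
    }
    return std::string();
}
```

Passing the file check in as a callback keeps the sketch independent of any particular file system API.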

.NET also includes provisions for sharing components between applications. Shared components are installed in the global assembly cache (GAC). The GAC is a special directory on your machine that holds shared assemblies. The GAC can hold multiple versions of the same DLL, thereby solving the versioning problem.

In COM, you name components uniquely using GUIDs. When you ask for a component via its GUID, you get whichever version of the component is currently registered under that GUID. In .NET, components are named uniquely through strong naming.

A common language runtime assembly name consists of four parts: a simple text name, a version number, culture information, and a strong name. A strong name is based on a pair of keys—one public and one private. The unique name of an assembly is the conjunction of the text name and the public key. You’ll see an example of signing an assembly in Chapter 32.

.NET Versioning

As you just saw, .NET prefers private components to public components. However, sharing a component is sometimes essential. When you share code, versioning is very important. COM didn’t quite get it right. Rather than hoping that the latest version of a component is available on a machine, .NET allows multiple versions of a single component to reside on the same machine. Naturally, this arrangement implies some form of versioning. .NET assemblies deployed in the GAC require version information in the manifest. This is simple enough—you just make sure the correct attributes are applied in the source code. Assembly references used by client code contain the version number of the assembly that the client expects to see. You’ll see this when we look at some assemblies using a tool named ILDASM in Chapter 32. The runtime uses version numbers when binding to shared assemblies. Rather than hoping that a DLL is compatible by name, .NET builds the version number into the name of the DLL. Clients latch onto a specific DLL by binding to a specific version number.
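Side-by-side versioning amounts to making the version part of the lookup key. The sketch below illustrates only that idea: SharedCache, Install, and Bind are hypothetical names, the four-part Version layout is an assumption, and the real runtime’s binding rules (including any configured policy) are richer than an exact-match lookup.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical four-part version number; std::array compares part by part.
typedef std::array<int, 4> Version;

// Hypothetical shared cache keyed by (name, version), so several versions
// of the same component can be installed at once.
class SharedCache {
public:
    void Install(const std::string& name, const Version& version,
                 const std::string& path) {
        entries_[std::make_pair(name, version)] = path;
    }
    // Returns the path for exactly the requested version, or "" if absent.
    std::string Bind(const std::string& name, const Version& version) const {
        std::map<Key, std::string>::const_iterator it =
            entries_.find(std::make_pair(name, version));
        return it == entries_.end() ? std::string() : it->second;
    }
private:
    typedef std::pair<std::string, Version> Key;
    std::map<Key, std::string> entries_;
};
```

Installing version 2.0.0.0 does not disturb clients bound to 1.0.0.0, because the two entries never share a key.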

Living Within the Common Language Runtime

We spent the first part of this book looking at how to write native-code Windows applications. Programming native-code applications offers performance advantages and flexibility. However, along with the freedom and flexibility comes a great deal of programmer responsibility and hygiene when it comes to resource management and type safety. Writing code to run under the common language runtime relieves you of many of the responsibilities normally associated with native-code programming. For example, the common language runtime takes care of programming responsibilities ranging from array-bounds checking to managing memory and securing components programmatically. This is a benefit that Visual Basic developers have enjoyed for years. Now the convenience of having your code managed for you is available to C++ developers as well.

Intermediate Language and Just-in-Time Compiling

The traditional Windows-based applications we’ve been building throughout this book compile down to native Intel code and run right on the chip. .NET and the common language runtime work a bit differently—.NET assemblies compile down to IL. The common language runtime’s execution engine (Mscoree.dll) compiles the IL into machine code immediately before its execution in a process known as just-in-time (JIT) compiling. It adds one more layer of indirection between the human-created source code and the chip the code is to run on. This layer of indirection carries many advantages with it.

One of the primary advantages of using IL is that multiple syntaxes can be used for writing .NET code. As long as the compiler can turn source code into IL, it does not matter which programming language or environment you use. In this book, we’re using managed C++. However, many .NET languages are available: C# and Visual Basic .NET from Microsoft and even a version of COBOL.

Another advantage of using IL is type safety. How many times have you chased down pointer bugs, array indexing bugs, or parameter-passing bugs because of mismatched data types or incorrect type casting? It happens less in C++, but this sort of bug ran rampant in older C-style coding. If you use IL between the source code and the final native executable, the runtime can verify the code within an assembly during the final JIT compilation down to machine code. The common language runtime verifies the code to make sure that it does not do anything dangerous such as accessing memory directly. Adding IL between the source code and the final native code allows a higher degree of protection than having pure native-code applications around.

The final advantage of using IL is that it inherently decouples your EXEs and DLLs from the operating system and hardware platform. When an EXE or DLL consists of intermediate code (not native code), it is truly platform-independent. Right now, Microsoft has a version of the common language runtime that runs on Windows 2000, Windows NT, and Windows 98. IL allows for the possibility of deploying the runtime on other platforms that are not running Windows or not running Intel processors.

.NET Garbage Collection

Living under the common language runtime means that code doesn’t have to look after itself. Developers who use native C++ must track their resources vigilantly in order to not cause leaks. .NET developers don’t have to pay attention to that—.NET uses garbage collection.

You can find more comprehensive discussions of .NET garbage collection out there, including Applied Microsoft .NET Framework Programming by Jeffrey Richter (Microsoft Press, 2002). However, I’ll give you a rundown of how memory lives within the common language runtime.

As a C++ developer, you’re aware of how a program allocates and manages memory because you’re the one doing it. You allocate an object using the new operator and delete it when you’re done with it. You’re probably also aware of some of the other kinds of memory used within your applications—those kinds of memory taken up by global and static variables. Finally, many programs have local variables that live for a short time on the stack. .NET applications also use all these types of memory allocation.

The difference with .NET is that the common language runtime keeps track of all these resource allocations. The places where an application holds references into the managed heap (global and static variables, local variables on the stack, and CPU registers) are referred to as the application’s roots. The runtime’s garbage collector follows the references from these roots and determines which objects are no longer referenced. When memory is no longer referenced, it’s collected. This greatly simplifies programming.

One advantage of IL is that the JIT compiler knows where the application’s roots are. The JIT compiler builds a list of root references and maintains it (with the help of the common language runtime) as the program executes. When the garbage collector has to figure out what memory is no longer referenced, the list of roots is the starting point.

While the program is running, garbage collection can occur in several situations: when an allocation can’t be satisfied from the space available in the managed heap, when code calls the GC.Collect method, and when the system runs low on memory. When a garbage collection occurs, the common language runtime suspends all threads within the process at safe points (locations in the executable code where the runtime can safely suspend a thread), frees unreferenced objects, and compacts the managed heap.

While the threads are suspended, the garbage collector starts with the application roots and walks the object graphs within the system, figuring out which objects are referenced and which are not. The collector marks each object it visits, so cyclical references pose no problem: a cycle is walked only once, and a cycle that no root can reach is never marked and is therefore collected.
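To make the graph walk concrete, here is a minimal sketch in standard C++ rather than anything taken from the runtime itself. The ToyObject type, the index-based references, and the mark_reachable and demo_mark functions are all invented for illustration; the real collector is far more sophisticated. The sketch shows only the idea described above: start at the roots, mark everything reachable, and treat whatever stays unmarked as garbage, cycles included.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a managed heap (hypothetical): each object is identified by
// its index and holds the indices of the objects it references.
struct ToyObject {
    std::vector<std::size_t> refs;  // outgoing references (indices into the heap)
    bool marked = false;            // set once the collector has visited the object
};

// Mark every object reachable from the roots. The `marked` flag doubles as a
// "visited" check, so a cycle is walked only once and cannot loop forever.
void mark_reachable(std::vector<ToyObject>& heap,
                    const std::vector<std::size_t>& roots) {
    for (ToyObject& obj : heap) obj.marked = false;

    std::vector<std::size_t> pending(roots.begin(), roots.end());
    while (!pending.empty()) {
        std::size_t index = pending.back();
        pending.pop_back();
        assert(index < heap.size());       // a reference must point into the heap
        if (heap[index].marked) continue;  // already visited
        heap[index].marked = true;
        for (std::size_t target : heap[index].refs) pending.push_back(target);
    }
}

// Build a small heap and report which objects survive:
//   0 -> 1 -> 2 -> 0   (a cycle reachable from the root)
//   3 <-> 4            (a cycle nothing else refers to)
std::vector<bool> demo_mark() {
    std::vector<ToyObject> heap(5);
    heap[0].refs.push_back(1);
    heap[1].refs.push_back(2);
    heap[2].refs.push_back(0);
    heap[3].refs.push_back(4);
    heap[4].refs.push_back(3);

    std::vector<std::size_t> roots;
    roots.push_back(0);  // object 0 is the only root
    mark_reachable(heap, roots);

    std::vector<bool> live;
    for (const ToyObject& obj : heap) live.push_back(obj.marked);
    return live;
}
```

In demo_mark, objects 0 through 2 come back marked, while objects 3 and 4 stay unmarked even though they refer to each other. A reference-counting scheme would keep that second pair alive; a collector that traces from the roots does not.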

After figuring out which objects can be removed, the garbage collector moves nongarbage objects to the bottom of the heap to make room at the top. This makes subsequent memory allocations very fast because the top of the runtime heap is always clear. By contrast, the C++ memory manager often creates a fragmented heap while allocating and deallocating blocks of varying sizes.

Before the threads resume, the garbage collector updates any references to nongarbage objects that have been moved. The runtime then resumes the threads, and they pick up where they left off. The application is unaware of any relocations once the threads resume. For the most part, it’s very hard to detect when garbage collection happens.
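The compaction step and the reference fix-up that goes with it can be sketched the same way. This is a toy model in standard C++, not the runtime’s algorithm: HeapCell, compact, and demo_compact are hypothetical names, references are plain indices into a vector, and the live flag is assumed to have been set by an earlier marking pass.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a managed object.
struct HeapCell {
    int payload;                    // stand-in for the object's data
    std::vector<std::size_t> refs;  // references, stored as heap indices
    bool live;                      // result of an earlier mark phase
};

// Slide live objects toward the start of the heap, then rewrite every
// reference (including the roots) to the object's new position.
void compact(std::vector<HeapCell>& heap, std::vector<std::size_t>& roots) {
    const std::size_t kDead = heap.size();  // sentinel: "was not relocated"
    std::vector<std::size_t> new_index(heap.size(), kDead);

    // Pass 1: move survivors down and remember where each one landed.
    std::size_t next_free = 0;
    for (std::size_t i = 0; i < heap.size(); ++i) {
        if (!heap[i].live) continue;
        new_index[i] = next_free;
        if (next_free != i) heap[next_free] = std::move(heap[i]);
        ++next_free;
    }
    heap.erase(heap.begin() + next_free, heap.end());  // the rest is free space

    // Pass 2: patch references so they point at the new locations.
    for (HeapCell& cell : heap) {
        for (std::size_t& ref : cell.refs) {
            assert(new_index[ref] != kDead);  // live objects never refer to garbage
            ref = new_index[ref];
        }
    }
    for (std::size_t& root : roots) {
        assert(new_index[root] != kDead);
        root = new_index[root];
    }
}

// Heap before: [A (garbage), B (live), C (garbage), D (live)]; root -> D -> B.
// Returns true if the root still reaches D, and D still reaches B, afterward.
bool demo_compact() {
    std::vector<HeapCell> heap;
    heap.push_back(HeapCell{10, {}, false});  // A
    heap.push_back(HeapCell{20, {}, true});   // B
    heap.push_back(HeapCell{30, {}, false});  // C
    heap.push_back(HeapCell{40, {1}, true});  // D refers to B (index 1)
    std::vector<std::size_t> roots;
    roots.push_back(3);                       // the root refers to D

    compact(heap, roots);

    const HeapCell& d = heap[roots[0]];
    return heap.size() == 2 && d.payload == 40 && heap[d.refs[0]].payload == 20;
}
```

After compact runs, B and D sit next to each other at the start of the heap, and new allocations can simply be appended after them. The code that owns the root never sees the move because its reference was rewritten along with all the others.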

Most of this memory allocation and deallocation happens behind the scenes, and you don’t have to worry too much about it. Even if you deeply nest references, the garbage collector will take good care of you and you can live a carefree existence as far as memory allocation is concerned.

Finalization

In C++, we’re used to placing clean-up code within a destructor because we know when an object will be destroyed: the programmer is responsible for deleting objects. However, in .NET the garbage collector is responsible for getting rid of objects, and it does so on its own schedule. You don’t know when (or sometimes even if) an object will be freed. So instead of destructors, the common language runtime supports finalizers. If an object needs to be notified before it’s collected, its class can override the virtual Finalize method (which is inherited from System.Object). When the collector classifies an object as garbage, the runtime invokes the object’s Finalize method before reclaiming the object’s memory.
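For contrast, here is a small native C++ sketch of the deterministic clean-up that a finalizer cannot promise. The Resource class, the g_log vector, and demo_destructors are made up for this illustration. The point is only that each destructor runs at a spot you can name in the source code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of events so the timing of clean-up is visible.
std::vector<std::string> g_log;

// Hypothetical class that "acquires" something in its constructor and
// releases it in its destructor.
class Resource {
public:
    explicit Resource(const std::string& name) : name_(name) {
        assert(!name.empty());  // every resource in this demo needs a name
        g_log.push_back("acquire " + name_);
    }
    ~Resource() {
        // Clean-up code: runs at a point the programmer controls.
        g_log.push_back("release " + name_);
    }
private:
    std::string name_;
};

std::vector<std::string> demo_destructors() {
    g_log.clear();
    {
        Resource on_stack("stack");                // destroyed at the closing brace
        Resource* on_heap = new Resource("heap");
        g_log.push_back("work");
        delete on_heap;                            // destroyed exactly here
    }
    return g_log;
}
```

The log always reads acquire stack, acquire heap, work, release heap, release stack. With a finalizer, the two release entries would appear whenever the collector got around to the objects, if at all.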

The garbage collector has been tightly tuned by Microsoft. When the garbage collector is left to its own devices, you’ll barely notice anything when garbage is collected. However, if you end up overriding Finalize too often, you’ll impede the garbage collector. Whenever the runtime allocates an object whose class overrides Finalize, it records a reference to the object on a finalization list, thereby slowing the allocation. During a collection, the garbage collector has to check the finalization list and wait until Finalize has been called before it can release the memory, thereby slowing collections. Remember that you need to override Finalize only when an object holds on to unmanaged resources. The common language runtime will manage nested references to managed objects for you. Finalization is really there to help classes manage non-.NET resources such as file handles or other unmanaged resources.

Threading and the Common Language Runtime

Preemptive threading has been around since the earliest versions of Windows NT, and the common language runtime would be an incomplete platform without it. Threading in the common language runtime is more straightforward than threading with the raw Win32 API. The common language runtime includes types for starting, stopping, and suspending threads.

AppDomains

The basic execution and resource boundary is the process space. Processes maintain their own heaps and other resources, and Windows processes define a security and execution boundary. The process space still exists for applications running under the common language runtime. However, process spaces can be further divided into AppDomains, which also serve as security and execution boundaries.

AppDomains are like logical process spaces within a real process space. Assemblies serve as the logical (rather than physical) deployment model. A physical process can host separate logical AppDomains to form separate fault-tolerance boundaries within a single process. That way, it can protect parts of your application from each other (for example, if you don’t completely trust a component). An AppDomain gives you many of the same advantages that you get when you put your code into a separate process, but without the overhead of a process. Figure 31-3 shows several common language runtime components distributed between two AppDomains within a single process.

Figure 31-3. Common language runtime components distributed between multiple AppDomains.

Interoperability

One hard lesson we’ve all learned is the importance of backward compatibility and being able to link to older (“legacy”) code bases. In fact, Windows owes much of its own success to backward compatibility. When people invest lots of money into applications, they’re not going to simply toss them away just because a new operating system is available. Windows has always fully supported older versions of applications. Keeping the old code running is very important—just ask any COO or CTO. Companies are not going to rewrite all their code just because of .NET. Often the most critical part of an application is a very old component that nobody’s touched for years. So getting new code to work with older code is an extremely important feature of .NET.

.NET provides three basic mechanisms to facilitate interoperability between new code and old code: platform invoke (P/Invoke), COM-callable wrappers for calling from COM code to common language runtime code, and runtime-callable wrappers for calling from the runtime to COM components.

Platform Invoke

As you’ve seen, client applications need a way to load library code dynamically and get to the entry points. In Windows, the functions that do this are LoadLibrary and GetProcAddress. If you find yourself needing to call entry points within a specific legacy DLL, P/Invoke is the way to go.

To use P/Invoke, you prototype functions within your managed code and mark them using the DllImport attribute. When the code compiles to an assembly, the functions will be understood to be living in an external DLL. The common language runtime will call LoadLibrary/GetProcAddress automatically. Using the DllImport attribute, you can specify the calling convention, you can alias the method so it has a different name from the real DLL function within your program, and you can control the character set that the function uses.
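To appreciate what the runtime takes off your hands, here is a sketch of the manual steps in native C++: load a library by name, look up an entry point, and call it through a function pointer. It illustrates the LoadLibrary/GetProcAddress pattern only and does not show how the runtime implements P/Invoke. The call_unary_from_library function and the UnaryMathFn typedef are hypothetical, the library names are assumptions about the host system, and the non-Windows branch uses the POSIX dlopen/dlsym equivalents (some older Linux toolchains need -ldl at link time).

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif

// Signature we expect the entry point to have, for example double cos(double).
typedef double (*UnaryMathFn)(double);

// Load `library`, look up `symbol`, call it with `x`, and store the result.
// Returns false if the library or the entry point cannot be found.
bool call_unary_from_library(const char* library, const char* symbol,
                             double x, double* result) {
    assert(library != nullptr && symbol != nullptr && result != nullptr);
#ifdef _WIN32
    HMODULE module = LoadLibraryA(library);
    if (module == nullptr) return false;
    UnaryMathFn fn =
        reinterpret_cast<UnaryMathFn>(GetProcAddress(module, symbol));
#else
    void* module = dlopen(library, RTLD_NOW);
    if (module == nullptr) return false;
    UnaryMathFn fn = reinterpret_cast<UnaryMathFn>(dlsym(module, symbol));
#endif
    bool found = (fn != nullptr);
    if (found) *result = fn(x);  // call through the resolved entry point
#ifdef _WIN32
    FreeLibrary(module);
#else
    dlclose(module);
#endif
    return found;
}

// Assumed library names; adjust them for the system at hand.
#ifdef _WIN32
const char* const kMathLibrary = "msvcrt.dll";
#else
const char* const kMathLibrary = "libm.so.6";
#endif
```

Calling call_unary_from_library(kMathLibrary, "cos", 0.0, &value) should leave 1.0 in value when the library is present. Every one of these steps, plus the error handling and the parameter conversion, is what a single DllImport declaration asks the runtime to do for you.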

COM Interop: TLBIMP and TLBEXP

Of course, much of the code out there is COM code, so it’s important to be able to call back and forth between COM code and common language runtime code. .NET provides facilities for both situations: calling a legacy COM class from the common language runtime and calling a common language runtime class from some existing COM code. The .NET Framework provides two utilities to accommodate these situations: the Type Library Importer (Tlbimp.exe) and the Type Library Exporter (Tlbexp.exe). The Type Library Importer reads a COM type library, emits common language runtime metadata, and creates a runtime-callable wrapper. The Type Library Exporter reads common language runtime metadata and creates a type library and a COM-callable wrapper. These utilities are fairly straightforward to use.


The Common Language Runtime (2005-12-08 11:05:00)
