Wednesday, June 15, 2016

Grokking the CLI - Part 2: Meta Meanderings


This is Part 2 in a series on Understanding the Metadata within the .NET Common Language Infrastructure, aka the ECMA-335.

Part 1 can be found here.

Five Streams to Rule them All

Last time we went over how to get to the hints that pointed at where the data is located.  We had to understand the MS DOS Header, PE Image, its Data Dictionary, just to get a pointer to the real data.

Now that we have that, we can start peeling back the layers that make up the metadata.

The Relative Virtual Address we received at the end of our last foray points us to a MetadataRoot structure, which outlines the Metadata Streams.  Once you've read in the MetadataRoot, you'll then read in a series of StreamHeaders, which point to the individual streams (all prefixed with a pound #, which I'm omitting for this post).   Those streams are as follows:
  • Strings:The strings which make up the names of things, such as the names of methods, properties, and so on.
  • Blob:Binary blobs which make up constants, type specifications, method signatures, member reference signatures, and so on.
  • US:User Strings - When you use string constants in code (ex. Console.WriteLine("TEST");), the strings end up here.
  • Guid:The Guids that get generated by the .NET compilers end up here.  You probably don't know they exist.  The only metadata table that refers to this is the Module table.
  • ~ or -:The Metadata Table Stream, perhaps the most complex and 'varied' stream of them all.  As of writing this, there are 38 tables supported in .NET Metadata.


In .NET all streams are 'compressed' in some way, the Strings stream is no exception.  One of the simplest ways they accomplish 'compression' in this stream is by Right Side Similarity.  If you have Two strings in your assembly: MyItem, and Item.  This will be saved as a single string of MyItem, suffixed with a null character (0x00.)

The metadata tables that refer to this string table, give indexes within the string table's data.  Those indexes would point to the place where the string starts, and the null character would say when it's finished.

So if the string MyItem was at a relative location of 0, Item would be at a relative location of 2.
Here's the basic approach to reading the stream that I used. To make things simpler on me, I wrote a 'Substream' that allows relative indexes to be easier to calculate. It makes it so I don't need to worry about adding on the section's offset each time I read things.
private unsafe bool ReadSubstring(uint index)
    lock (this.syncObject)
        reader.BaseStream.Position = index;
        /* *
         * To save space, smaller strings that exist as the 
         * tail end of another string, are condensed accordingly.
         * *
         * It's quicker to construct the strings from the 
         * original source than it is to iterate through the
         * location table used to quickly look items up.
         * */
        uint loc = index;
        while (loc < base.Size)
            byte current = reader.ReadByte();
            if (current == 0)
        uint size = loc - index;
        reader.BaseStream.Position = index;
        byte[] result = new byte[size];

        for (int i = 0; i < size; i++)
            result[i] = reader.ReadByte();
        this.AddSubstring(ConvertUTF8ByteArray(result), index);
        return loc < base.Size;


I would say that the Blob stream increases the difficulty a little bit. Blobs start with their lengths as a compressed int (details in User Strings below) With this, we get a multitude of signatures, I would go over all of them here, but a great article already exists for this. I've provided an excerpt from the blob SignatureParser below, to indicate that it's similar to normal parsing, with exception to the fact that it's parsing binary instead of text:
internal static ICliMetadataMethodSignature ParseMethodSignature(EndianAwareBinaryReader reader, CliMetadataFixedRoot metadataRoot, bool canHaveRefContext = true)

    const CliMetadataMethodSigFlags legalFlags                =
            CliMetadataMethodSigFlags.HasThis                 |
            CliMetadataMethodSigFlags.ExplicitThis            ;

    const CliMetadataMethodSigConventions legalConventions    =
            CliMetadataMethodSigConventions.Default           |
            CliMetadataMethodSigConventions.VariableArguments |
            CliMetadataMethodSigConventions.Generic           |
            CliMetadataMethodSigConventions.StdCall           |
            CliMetadataMethodSigConventions.Cdecl             ;

    const int legalFirst  = (int)legalFlags                   |
                            (int)legalConventions             ;
    byte firstByte        = reader.ReadByte();
    if ((firstByte & legalFirst) == 0 && firstByte != 0)
        throw new BadImageFormatException("Unknown calling convention encountered.");
    var callingConvention = ((CliMetadataMethodSigConventions)firstByte) & legalConventions;
    var flags = ((CliMetadataMethodSigFlags)firstByte) & legalFlags;

    int paramCount;
    int genericParamCount = 0;
    if ((callingConvention & CliMetadataMethodSigConventions.Generic) == CliMetadataMethodSigConventions.Generic)
        genericParamCount = CliMetadataFixedRoot.ReadCompressedUnsignedInt(reader);
    paramCount = CliMetadataFixedRoot.ReadCompressedUnsignedInt(reader);
    ICliMetadataReturnTypeSignature returnType = ParseReturnTypeSignature(reader, metadataRoot);
    bool sentinelEncountered = false;
    if (canHaveRefContext)
        ICliMetadataVarArgParamSignature[] parameters = new ICliMetadataVarArgParamSignature[paramCount];
        for (int i = 0; i < parameters.Length; i++)
            byte nextByte = (byte)(reader.PeekByte() & 0xFF);
            if (nextByte == (byte)CliMetadataMethodSigFlags.Sentinel)
                if (!sentinelEncountered)
                    flags |= CliMetadataMethodSigFlags.Sentinel;
                    sentinelEncountered = true;
            parameters[i] = (ICliMetadataVarArgParamSignature)ParseParam(reader, metadataRoot, true, sentinelEncountered);
        return new CliMetadataMethodRefSignature(callingConvention, flags, returnType, parameters);
        ICliMetadataParamSignature[] parameters = new ICliMetadataParamSignature[paramCount];
        for (int i = 0; i < parameters.Length; i++)
            parameters[i] = ParseParam(reader, metadataRoot);
        return new CliMetadataMethodDefSignature(callingConvention, flags, returnType, parameters);
There are a few oddities in the blob signatures, the StandAloneSig table, for instance, usually only defines local variable types, that is the description of what type was used for the local variables in methods. An exceptions was found by me in 2012 when I started writing the CLI metadata parser: Field Signatures. Had I been aware enough to know, I would've found that the two mentioned in the first answer also stumbled across this a few years prior. Please note, the answer in the first linked MSDN post links to a blog that no longer exists, the second MSDN link points to their discovery, but with less detail. Turns out field signatures in the StandAloneSig table are there to support debugging the constants you specify in your code, but only when compiling in Debug mode.
The irony here is the constants from excerpt above were what clued me to the out of place Field signatures.
Blobs also contain details about the constants used, and custom attributes. I haven't personally gotten to the constants because as of yet I haven't needed to.

US - User Strings

User Strings are pretty self explanatory. They're the stream that method bodies refer to when loading string constants to work with them. This behaves much more like the Blob stream than the normal Strings stream. All user strings start with a compressed int, which is read like so:
public static int ReadCompressedUnsignedInt(EndianAwareBinaryReader reader, out byte bytesUsed)
    byte compressedFirstByte           = reader.ReadByte();
    const int sevenBitMask             = 0x7F;
    const int fourteenBitmask          = 0xBF;
    const int twentyNineBitMask        = 0xDF;
    bytesUsed                          = 1;
    int decompressedResult             = 0;

    if ((compressedFirstByte & sevenBitMask) == compressedFirstByte)
        decompressedResult             = compressedFirstByte;
    else if ((compressedFirstByte & fourteenBitmask) == compressedFirstByte)
        byte hiByte                    = (byte)(compressedFirstByte & 0x3F);
        byte loByte                    = reader.ReadByte();
        decompressedResult             = loByte | hiByte << 8;
        bytesUsed                      = 2;
    else if ((compressedFirstByte & twentyNineBitMask) == compressedFirstByte)
        byte hiWordHiByte              = (byte)(compressedFirstByte & 0x1F);
        byte hiWordLoByte              = reader.ReadByte();
        byte loWordHiByte              = reader.ReadByte();
        byte loWordLoByte              = reader.ReadByte();
        decompressedResult             = loWordLoByte | loWordHiByte << 8 | hiWordLoByte << 16 | hiWordHiByte << 24;
        bytesUsed                      = 4;
    return decompressedResult;
In the case of User Strings, all byte-counts are odd. The last byte contains a zero (0) or one (1) depending on whether or not the string needs special processing beyond UTF-8 processing (Partition ][, 24.2.4 of the ECMA-335 spec). As I was writing this post, I realized I ignore this point entirely! Looks like I have a ToDo entry in my future.


The Guid stream is the simplest, but that doesn't mean its less important. The Guids provided in this area are used to distinguish the Modules that result from a build from one another. Interesting to note, Roslyn is allowing a deterministic build approach which will guarantee the MVID (Module Version Identifier) will be identical from build to build if you enable deterministic builds.

To Be Continued...

The Table Stream (#~ or #-) will require a post all of its own. Stay tuned for the next installment.

No comments: