dd86k's blog

Machine code enthusiast

Native .NET Application Debugging

Author: dd
Published: January 18, 2020
Last modified: December 25, 2022 at 15h02

Here’s something interesting I noticed about debugging .NET applications.

So, if you launch a .NET executable (tested on Windows, .NET 4.5 Desktop) in a machine-native debugger, you’ll get the usual machine breakpoint (INT3, 0xCC). The next exception then has its usual memory location, but an odd OS exception code: 0x04242420.

This happens with any .NET application, always pointing to the same memory location (32-bit): 77ABF146. That location holds the following machine code: 8B 4C 24 54 (MOV ECX, [ESP+84], the displacement being 0x54 in hexadecimal).
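To double-check the disassembly by hand, here is a minimal sketch in C that splits the ModRM and SIB bytes of that instruction into their bit fields. The type and function names (ModRM, SIB, decode_modrm, decode_sib) are mine and purely illustrative; only the field layout comes from the x86 encoding rules.

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
typedef struct {
    uint8_t mod;
    uint8_t reg;
    uint8_t rm;
} ModRM;

/* Fields of an x86 SIB byte: scale (bits 7-6), index (bits 5-3), base (bits 2-0). */
typedef struct {
    uint8_t scale;
    uint8_t index;
    uint8_t base;
} SIB;

/* Split a ModRM byte into its three fields. */
static ModRM decode_modrm(uint8_t b)
{
    ModRM m = { (uint8_t)(b >> 6), (uint8_t)((b >> 3) & 7), (uint8_t)(b & 7) };
    return m;
}

/* Split a SIB byte into its three fields. */
static SIB decode_sib(uint8_t b)
{
    SIB s = { (uint8_t)(b >> 6), (uint8_t)((b >> 3) & 7), (uint8_t)(b & 7) };
    return s;
}
```

For 8B 4C 24 54: the opcode 8B is MOV r32, r/m32. The ModRM byte 4C gives mod=1 (an 8-bit displacement follows), reg=1 (ECX), and rm=4 (a SIB byte follows). The SIB byte 24 gives index=4 (no index register) and base=4 (ESP). The final byte 54 is the displacement, 84 in decimal.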

All exceptions that a MOV can raise are faults (GP, SS, NP, PF, AC, and UD). Faults can be corrected, but that requires error handling, which, well, I don’t do in any of my old, old .NET apps. In fact, this happened in every .NET application I tried, with the same opcode and memory location.

So we could say “well, it probably happens in mscorlib.dll” (an assumption) and close the case, right? Not yet.

Windows is outputting a different OS exception code than usual. Usually, that’d be EXCEPTION_ACCESS_VIOLATION or EXCEPTION_ILLEGAL_INSTRUCTION, but this is really a CLRDBG_NOTIFICATION_EXCEPTION_CODE (0x04242420, .NET Desktop 4.0+).
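As a rough illustration of how a native debugger could tell these codes apart, here is a small C sketch. The numeric values for the breakpoint, access violation, and illegal instruction codes are the standard Win32 ones; the macro names carry a _CODE suffix only so that they don't clash with the Windows SDK definitions, and describe_exception_code is a made-up helper.

```c
#include <stdint.h>

/* Standard Win32 exception codes (same values as in the Windows SDK). */
#define EXCEPTION_BREAKPOINT_CODE           0x80000003u
#define EXCEPTION_ACCESS_VIOLATION_CODE     0xC0000005u
#define EXCEPTION_ILLEGAL_INSTRUCTION_CODE  0xC000001Du
/* CLR debugger notification code observed in this post (.NET Desktop 4.0+). */
#define CLRDBG_NOTIFICATION_CODE_DESKTOP    0x04242420u

/* Hypothetical helper: map an exception code to a readable name. */
static const char *describe_exception_code(uint32_t code)
{
    switch (code) {
    case EXCEPTION_BREAKPOINT_CODE:
        return "EXCEPTION_BREAKPOINT";
    case EXCEPTION_ACCESS_VIOLATION_CODE:
        return "EXCEPTION_ACCESS_VIOLATION";
    case EXCEPTION_ILLEGAL_INSTRUCTION_CODE:
        return "EXCEPTION_ILLEGAL_INSTRUCTION";
    case CLRDBG_NOTIFICATION_CODE_DESKTOP:
        return "CLRDBG_NOTIFICATION_EXCEPTION_CODE";
    default:
        return "unknown";
    }
}
```

In a real debugger loop, the code would come from the exception record of the debug event; here it is just a plain integer so the sketch stays portable.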

If this is a CLR debugging notification, that means, well, maybe you guessed it, Common Intermediate Language (CIL) instructions are being run by the virtual machine.

Identifying the CIL Opcode

According to Common Language Infrastructure (CLI) Partition III CIL Instruction Set (With Added Microsoft Specific Implementation Notes) (2006, Fourth Edition, MS Partition III.pdf), the 0x8B opcode signifies conv.ovf.u.un: “Convert unsigned to a native unsigned int (on the stack as native int) and throw an exception on overflow.” It is followed by this description:

Convert the value on top of the stack to the type specified in the opcode, and leave that converted value on the top of the stack. If the value cannot be represented, an exception is thrown. The item on the top of the stack is treated as an unsigned value before the conversion.

Conversions from floating-point numbers to integral values truncate the number toward zero. Note that integer values of less than 4 bytes are extended to int32 (not native int) on the evaluation stack.

The acceptable operand types and their corresponding result data type are encapsulated in Table 8: Conversion Operations.

To summarize: conv (convert) ovf (with overflow detection) u (to a native unsigned int) un (treating the source value as unsigned). Basically, this takes a CIL unsigned integer value and converts it to a native unsigned int value. The only operand is the value on top of the evaluation stack, here an unsigned 32-bit value.

Looking at the conversion table, an int32 (input value) converted to a native int (convert-to value) is sign-extended (and stops GC tracking).

Let’s see, which exceptions can it throw? Only one: System.OverflowException “is thrown if the result cannot be represented in the result type”.
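To make the semantics concrete, here is a minimal C model of what conv.ovf.u.un does according to the description above. It is only a sketch: the function name, the native_bits parameter (standing in for the width of a native unsigned int on the target), and the boolean return value (standing in for System.OverflowException) are all my own assumptions, not how the CLR implements it.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of conv.ovf.u.un: treat `value` as unsigned and convert it to a
 * native unsigned int that is `native_bits` wide (for example 32 or 64).
 * Returns false where the CLI would throw System.OverflowException,
 * otherwise stores the converted value in *result and returns true.
 */
static bool conv_ovf_u_un(uint64_t value, unsigned native_bits, uint64_t *result)
{
    /* Largest value representable in the target width. */
    uint64_t max = (native_bits >= 64)
        ? UINT64_MAX
        : ((UINT64_C(1) << native_bits) - 1);

    if (value > max)
        return false; /* cannot be represented in the result type */

    *result = value;
    return true;
}
```

With native_bits set to 32, the value 0xFFFFFFFF converts fine while 0x100000000 overflows; with native_bits set to 64, both fit.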

Identifying the Exception

Upon receiving the exception, three ExceptionInformation fields (DEBUG_EVENT.u.Exception.ExceptionRecord.ExceptionInformation[0], [1], and [2]) were populated with 0x31415927, 0x6B2E0000, and 0x00F3F438 (in 64-bit mode, these values are 64 bits wide, with the exception of the first field, which holds a 32-bit constant).

After looking at all the Partition manuals, I could not find the meaning of these values, nor the OS-specific implementation.

Looking online led me to the CoreCLR project (.NET Core): utils.cpp (https://github.com/dotnet/coreclr/blob/master/src/debug/shared/utils.cpp).

The first field (0x31415927) seems to be called a cookie: a constant meant to be verified against CLRDBG_EXCEPTION_DATA_CHECKSUM, whose value may change between .NET versions. I assume this to be implementation-specific and not OS-dependent. When the exception is raised, this field is filled with CLRDBG_EXCEPTION_DATA_CHECKSUM, and the debugger verifies that it matches.

The second field (0x6B2E0000), when FEATURE_DBGIPC_TRANSPORT_VM and FEATURE_DBGIPC_TRANSPORT_DI are not defined, seems to indicate the target base: “Base address of mscorwks. This identifies the instance of the CLR”. This value is filled with (ULONG_PTR)CORDB_ADDRESS_TO_PTR(pClrBaseAddress).

The third and last field (0x00F3F438) seems to be the payload, a pointer to the “real” debug event (“Target Address of DebuggerIPCEvent, which contains the ‘real’ event.”). It is returned to the debugger only if the cookie and the target base were verified; otherwise, NULL is returned. This value is filled with (ULONG_PTR)pIPCEvent.
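Putting the three fields together, the check described above can be sketched roughly like this in C. This is not the code from utils.cpp: the function name, the macro names, and the idea that the caller already knows the CLR base address it expects are assumptions made for illustration. The constants are the ones observed on .NET Desktop in this post.

```c
#include <stddef.h>
#include <stdint.h>

/* Values observed on .NET Desktop in this post (illustrative macro names). */
#define NOTIFICATION_CODE_DESKTOP 0x04242420u
#define DATA_CHECKSUM_DESKTOP     0x31415927u

/*
 * Hypothetical helper: given the exception code, the number of
 * ExceptionInformation entries, the entries themselves, and the CLR base
 * address the debugger expects, return the target address of the payload.
 * Returns 0 (NULL) if any of the checks fails.
 */
static uint64_t clr_payload_address(uint32_t code, size_t count,
                                    const uint64_t *info, uint64_t clr_base)
{
    if (code != NOTIFICATION_CODE_DESKTOP)
        return 0; /* not a CLR debugger notification */
    if (count < 3)
        return 0; /* the three fields are not all present */
    if (info[0] != DATA_CHECKSUM_DESKTOP)
        return 0; /* field 0: cookie mismatch */
    if (info[1] != clr_base)
        return 0; /* field 1: notification from another CLR instance */
    return info[2]; /* field 2: address of the "real" event in the target */
}
```

The returned address belongs to the debuggee's address space, so it has to be read with something like ReadProcessMemory rather than dereferenced directly.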

Dealing with the Exception

So, DebuggerIPCEvent, you might ask, how is it structured?

Upon searching further with CLRDBG_EXCEPTION_DATA_CHECKSUM in mind, I found the raw dispatcher in DebugEventSource.cpp (https://github.com/dotnet/corert/blob/master/src/Native/Runtime/DebugEventSource.cpp), which is part of the CoreRT project. The SendRawEvent function receives a DebugEventPayload pointer and directly puts it in the third field (as UInt64). Bingo! It also raises the exception (with RaiseException), passing the CLR notification code and the three arguments.

Searching for DebugEventPayload led me to DebugEvents.h (https://github.com/dotnet/corert/blob/master/src/Native/Runtime/inc/DebugEvents.h), where we can clearly see the definition of the structure, at least for CoreRT; I can’t say the same for .NET Desktop.

This is the entire header file as of the time of publication.

// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

// -----------------------------------------------------------------------------------------------------------
// This defines the payload of debug events that are emited by Redhawk runtime and
// received by the debugger. These payloads are referenced by 1st chance SEH exceptions


// -----------------------------------------------------------------------------------------------------------
// This version of holder does not have a default constructor.
#ifndef __DEBUG_EVENTS_H_
#define __DEBUG_EVENTS_H_

// Special Exception code for RH to communicate to debugger
// RH will raise this exception to communicate managed debug events.
// Exception codes can't use bit 0x10000000, that's reserved by OS.
// NOTE: This is intentionally different than CLR's exception code (0x04242420)
// Perhaps it is because now we are in building 40? Who would know
#define CLRDBG_NOTIFICATION_EXCEPTION_CODE  ((int) 0x04040400)

// This is exception argument 0 included in debugger notification events. 
// The debugger uses this as a sanity check.
// This could be very volatile data that changes between builds.
// NOTE: Again intentionally different than CLR's checksum (0x31415927)
//       It doesn't have to be, but if anyone is manually looking at these
//       exception payloads I am trying to make it obvious that they aren't
//       the same.
#define CLRDBG_EXCEPTION_DATA_CHECKSUM ((int) 0x27182818)

typedef enum 
{
    DEBUG_EVENT_TYPE_INVALID = 0,
    DEBUG_EVENT_TYPE_LOAD_MODULE = 1,
    DEBUG_EVENT_TYPE_UNLOAD_MODULE = 2,
    DEBUG_EVENT_TYPE_EXCEPTION_THROWN = 3,
    DEBUG_EVENT_TYPE_EXCEPTION_FIRST_PASS_FRAME_ENTER = 4,
    DEBUG_EVENT_TYPE_EXCEPTION_CATCH_HANDLER_FOUND = 5,
    DEBUG_EVENT_TYPE_EXCEPTION_UNHANDLED = 6,
    DEBUG_EVENT_TYPE_CUSTOM = 7,
    DEBUG_EVENT_TYPE_MAX = 8
} DebugEventType;

typedef unsigned int ULONG32;

struct DebugEventPayload
{
    DebugEventType type;
    union
    {
        struct 
        {
            CORDB_ADDRESS pModuleHeader; //ModuleHeader*
        } ModuleLoadUnload;
        struct
        {
            CORDB_ADDRESS ip;
            CORDB_ADDRESS sp;
        } Exception;
        struct
        {
            CORDB_ADDRESS payload;
            ULONG32 length;
        } Custom;
    };
};


#endif // __DEBUG_EVENTS_H_

We see the cookie value (CLRDBG_EXCEPTION_DATA_CHECKSUM) and the exception code value (CLRDBG_NOTIFICATION_EXCEPTION_CODE). Oddly enough, CoreRT’s CLRDBG_NOTIFICATION_EXCEPTION_CODE of 0x04040400 differs from .NET Desktop’s 0x04242420, and so does the cookie (0x27182818 versus 0x31415927). The header’s comments say the difference is intentional, which suggests that the two runtimes’ internal debugging mechanisms are not interchangeable.

And at last, the payload’s structure.

Dealing with DebugEventPayload

From that, I am guessing that the event types correspond to the structures as follows. When an event of type DEBUG_EVENT_TYPE_INVALID (0) occurs, I assume the runtime handles it in a particular way, which is why it’s not in the following table.

Structure         Exception Type                                     Exception Value
ModuleLoadUnload  DEBUG_EVENT_TYPE_LOAD_MODULE                       1
                  DEBUG_EVENT_TYPE_UNLOAD_MODULE                     2
Exception         DEBUG_EVENT_TYPE_EXCEPTION_THROWN                  3
                  DEBUG_EVENT_TYPE_EXCEPTION_FIRST_PASS_FRAME_ENTER  4
                  DEBUG_EVENT_TYPE_EXCEPTION_CATCH_HANDLER_FOUND     5
                  DEBUG_EVENT_TYPE_EXCEPTION_UNHANDLED               6
Custom            DEBUG_EVENT_TYPE_CUSTOM                            7
                  DEBUG_EVENT_TYPE_MAX                               8
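The same guess can be written down as a small C function. Keep in mind this is my reading of the header, not something the runtime documents: the PayloadMember enum and payload_member_for are made-up names, and I am treating DEBUG_EVENT_TYPE_INVALID and DEBUG_EVENT_TYPE_MAX as sentinels that select no union member.

```c
/* Event types, copied from DebugEvents.h. */
typedef enum {
    DEBUG_EVENT_TYPE_INVALID = 0,
    DEBUG_EVENT_TYPE_LOAD_MODULE = 1,
    DEBUG_EVENT_TYPE_UNLOAD_MODULE = 2,
    DEBUG_EVENT_TYPE_EXCEPTION_THROWN = 3,
    DEBUG_EVENT_TYPE_EXCEPTION_FIRST_PASS_FRAME_ENTER = 4,
    DEBUG_EVENT_TYPE_EXCEPTION_CATCH_HANDLER_FOUND = 5,
    DEBUG_EVENT_TYPE_EXCEPTION_UNHANDLED = 6,
    DEBUG_EVENT_TYPE_CUSTOM = 7,
    DEBUG_EVENT_TYPE_MAX = 8
} DebugEventType;

/* Hypothetical tag naming which member of the payload union is in use. */
typedef enum {
    MEMBER_NONE,
    MEMBER_MODULE_LOAD_UNLOAD,
    MEMBER_EXCEPTION,
    MEMBER_CUSTOM
} PayloadMember;

/* Guessed mapping from event type to the union member that carries its data. */
static PayloadMember payload_member_for(DebugEventType type)
{
    switch (type) {
    case DEBUG_EVENT_TYPE_LOAD_MODULE:
    case DEBUG_EVENT_TYPE_UNLOAD_MODULE:
        return MEMBER_MODULE_LOAD_UNLOAD;
    case DEBUG_EVENT_TYPE_EXCEPTION_THROWN:
    case DEBUG_EVENT_TYPE_EXCEPTION_FIRST_PASS_FRAME_ENTER:
    case DEBUG_EVENT_TYPE_EXCEPTION_CATCH_HANDLER_FOUND:
    case DEBUG_EVENT_TYPE_EXCEPTION_UNHANDLED:
        return MEMBER_EXCEPTION;
    case DEBUG_EVENT_TYPE_CUSTOM:
        return MEMBER_CUSTOM;
    default:
        /* Assumption: INVALID, MAX, and unknown values carry no member. */
        return MEMBER_NONE;
    }
}
```

A debugger reading a DebugEventPayload would first read the type field, then use a mapping like this to decide which of ModuleLoadUnload, Exception, or Custom to interpret.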

Using ReadProcessMemory with the payload address and a 4-byte buffer, I managed to obtain the value 3128 (0xC38) on .NET Desktop’s CLR. Mind you, this value was constant across the applications I tested.
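For reference, here is a tiny C sketch of how those 4 bytes turn into 3128, and why that value does not look like a DebugEventType. The helper names are hypothetical, the byte order assumes a little-endian x86 target, and the valid range (1 to 7) is taken from the CoreRT header quoted earlier, so it may simply not apply to .NET Desktop.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assemble a 32-bit value from 4 bytes read from a little-endian target. */
static uint32_t read_u32_le(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/*
 * True if `type` is one of the usable event types from the CoreRT header:
 * above DEBUG_EVENT_TYPE_INVALID (0) and below DEBUG_EVENT_TYPE_MAX (8).
 */
static bool is_known_event_type(uint32_t type)
{
    return type >= 1 && type < 8;
}
```

The bytes 38 0C 00 00 give 0x00000C38, which is 3128 and falls well outside that range. That is consistent with .NET Desktop using a different payload layout (DebuggerIPCEvent) than CoreRT’s DebugEventPayload.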

From there, everything remains a mystery for the CLR, which is a bummer, but it was fun nonetheless. If I ever manage to go further, I’ll update this post or make a part 2, possibly with CoreCLR. Or make a post about debugging Python, even. Who knows?