As Raymond Chen would say, this API was designed with "kernel-colored glasses".

StartTrace() is designed to tell the kernel to hook the relevant locations and begin recording events. Since all kernel objects have to be installed in the namespace (just like creating a file in Linux for your terminal or in Plan 9 for network sockets), you must specify a name; tracing cannot be anonymous.

Just as processes have the CreateProcess() and OpenProcess() function pair, there is -- consistently -- a StartTrace() and OpenTrace() function pair. OpenTrace() takes a name so it can look up the trace in the namespace.

StartTrace() can fail, and it should be able to. Kernel tracing is a privileged operation, which is why it requires administrator privileges to start. Because OpenTrace() is separate from StartTrace(), a Windows Service running with elevated privileges can call StartTrace() on behalf of an unprivileged user-mode process, and that process can then call OpenTrace() and still read the events. In fact, if the user-mode process knows that a trace is already running, it doesn't need to call StartTrace() at all. This applies to non-kernel traces as well as kernel traces (remember -- the name of the trace can be looked up in the namespace).
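The start/open split can be pictured with a toy model. This is only a sketch: `SessionNamespace`, `start()` and `open()` are hypothetical names, not part of the ETW API, and the model ignores any access checks the real OpenTrace() may perform.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model of a namespace of named trace sessions (hypothetical, not ETW).
class SessionNamespace {
public:
    // Privileged operation: fails for unprivileged callers and for names
    // that are already registered.
    bool start(const std::string& name, bool caller_is_admin) {
        if (!caller_is_admin) return false;
        return sessions_.insert(name).second;
    }

    // Lookup by name: succeeds only if the session has already been started,
    // no matter who started it.
    bool open(const std::string& name) const {
        return sessions_.count(name) != 0;
    }

private:
    std::set<std::string> sessions_;
};
```

In this model an elevated service would call `start("MyTrace", true)` once, and any number of unprivileged readers could then call `open("MyTrace")` without ever touching `start()`.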

Requiring a callback -- from a kernel perspective -- also makes sense. Imagine what would happen if the user-mode process were responsible for reading events. The kernel is busy pumping events into the buffer and, just then, the user-mode program hangs waiting for network I/O. What happens? If the buffer never drains, maybe the kernel hangs waiting for space (system deadlock). Or maybe the process "misses" events, and the developer thinks the API is broken. Or better yet, what happens when two processes, A and B, are reading from the buffer and one of them (process A) pauses for a GC? Does process B get blocked waiting for A to read events? Or does process B "lose" events because A doesn't consume them?
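To make the trade-off concrete, here is a minimal sketch of one possible policy: a bounded buffer whose producer never blocks and instead drops the event and counts the loss. `Event` and `DroppingBuffer` are hypothetical names for illustration; this is not a description of how ETW manages its buffers internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

// Hypothetical event record; real trace events carry far more metadata.
struct Event {
    std::uint32_t id;
};

// Bounded queue whose producer never waits: when the queue is full, the new
// event is dropped and a counter records the loss.
class DroppingBuffer {
public:
    explicit DroppingBuffer(std::size_t capacity) : capacity_(capacity) {}

    // Producer side. Returns false if the event had to be dropped.
    bool push(const Event& e) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.size() >= capacity_) {
            ++lost_;
            return false;
        }
        queue_.push_back(e);
        return true;
    }

    // Consumer side. Returns false when nothing is pending.
    bool pop(Event& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

    // Number of events dropped so far.
    std::size_t lost() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return lost_;
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::deque<Event> queue_;
    std::size_t lost_ = 0;
};
```

The producer stays fast no matter how slow the reader is, and the price is visible in `lost()`. The alternative, blocking in `push()` until space frees up, is exactly the deadlock scenario described above.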

The callback on a dedicated read thread is meant to work like the OS handling a processor interrupt: do the minimum amount of work to record the interrupt details and then return to the kernel. In this case, the callback should copy the data into a process-specific buffer and notify a different thread that data is ready (classic producer-consumer / publish-subscribe behavior). Anything more than that and you threaten the stability of the system.
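The copy-and-notify pattern might look roughly like the following. It is a portable sketch using only the C++ standard library: `RawEvent`, `EventHandoff` and its methods are hypothetical names, and a real ETW callback receives a different record type.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for the raw payload handed to the callback.
struct RawEvent {
    const std::uint8_t* data;
    std::size_t size;
};

class EventHandoff {
public:
    // Runs on the dedicated read thread: copy the bytes, signal, return.
    void on_event(const RawEvent& ev) {
        std::vector<std::uint8_t> copy(ev.data, ev.data + ev.size);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(std::move(copy));
        }
        ready_.notify_one();
    }

    // Signals that no further events will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    // Runs on a worker thread: blocks until an event is pending or the
    // handoff is closed. Returns false once closed and fully drained.
    bool wait_and_pop(std::vector<std::uint8_t>& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !pending_.empty(); });
        if (pending_.empty()) return false;
        out = std::move(pending_.front());
        pending_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::vector<std::uint8_t>> pending_;
    bool closed_ = false;
};
```

All slow work (parsing, logging, network I/O) belongs in the loop that calls `wait_and_pop()` on the worker thread, so `on_event()` never holds up the thread that delivers events.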

The overall architecture is both a workable design and consistent with other Windows APIs. The problem is that the API is not user-friendly. The packed structure is required so that all the necessary data can be passed to the kernel in one block and the kernel does not need to reference user-mode memory, but still, packed structures are no fun! There are also so many fields in the ETW structures that users are easily confused. Fortunately, most of the fields can be left as zero.
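For readers who have not met this style before, here is a portable sketch of the "header plus trailing string in one block" idea. `SessionHeader` and both functions are hypothetical and do not reproduce the real layout. In the real API, I believe the corresponding pieces are the `Wnode.BufferSize` and `LoggerNameOffset` fields of `EVENT_TRACE_PROPERTIES`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical header, loosely modeled on the idea of a packed properties
// structure; this is NOT the real ETW layout.
struct SessionHeader {
    std::uint32_t total_size;   // header plus trailing name, in bytes
    std::uint32_t flags;        // left as zero in this sketch
    std::uint32_t name_offset;  // byte offset of the name from the block start
};

// Build one contiguous, zero-initialized block: header, then the
// NUL-terminated session name. No pointers are stored, only an offset.
std::vector<std::uint8_t> make_session_block(const std::string& name) {
    const std::size_t total = sizeof(SessionHeader) + name.size() + 1;
    std::vector<std::uint8_t> block(total, 0);  // unset fields stay zero

    SessionHeader header{};
    header.total_size = static_cast<std::uint32_t>(total);
    header.name_offset = static_cast<std::uint32_t>(sizeof(SessionHeader));
    std::memcpy(block.data(), &header, sizeof header);
    std::memcpy(block.data() + header.name_offset, name.c_str(), name.size() + 1);
    return block;
}

// Receiver side: recover the name using only the offset stored in the block.
std::string read_session_name(const std::vector<std::uint8_t>& block) {
    SessionHeader header{};
    std::memcpy(&header, block.data(), sizeof header);
    return std::string(
        reinterpret_cast<const char*>(block.data() + header.name_offset));
}
```

Because the block contains offsets rather than pointers, it can be copied across a protection boundary in a single operation and still be interpreted on the other side, which is the point of the packed layout.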