If the function is equivalent to a no-op, and not explicitly marked as volatile for side-effects, the compiler absolutely can elide it. If there is a side-effect in hardware or wider systems like the OS, then it must be marked as volatile. If the code is just code, then a function call that does effectively nothing will probably become nothing.
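A minimal sketch of that distinction, assuming an optimising build. The names `no_effect`, `compute`, `poll_twice` and `status_register` are made up for illustration, and the volatile object only stands in for a real hardware register:

```c
/* Hypothetical helper with no observable effect. An optimising compiler
 * may remove calls to it entirely. */
static void no_effect(int x)
{
    (void)x;
}

int compute(int x)
{
    no_effect(x);      /* candidate for elision */
    return x * 2;
}

/* Hypothetical stand-in for a memory-mapped register. Every access to a
 * volatile object counts as observable behaviour. */
volatile int status_register;

int poll_twice(void)
{
    (void)status_register;   /* first read: result unused, but the read stays */
    return status_register;  /* second read */
}
```

Comparing the assembly from something like `-O2 -S` should show `compute` without any call, while `poll_twice` keeps both loads. The exact output depends on the compiler and version.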
That was one of the first optimisations we had, back with Fortran and COBOL. Before C existed - and as B started life as a stripped down Fortran compiler, the history carried through.
The K&R book describes the buddy system for malloc, and how its design makes it suitable for compiler optimisations - including ignoring a write to a pointer that does nothing, because the pointer will no longer be valid.
You are literally scaring me now. I'd understand such things being done when statically linking or running a JIT, but for a "normal" program, which malloc() implementation it will link against is not known during compilation. How can the compiler go, like, "eh, I'll assume free(malloc(x)) is a NOP and drop it" and not break most existing code?
> but for a "normal" program, which malloc() implementation it will link against is not known during compilation. How can the compiler go, like, "eh, I'll assume free(malloc(x)) is a NOP and drop it" and not break most existing code?
I'd suspect that eliding suitable malloc/free pairs would not break most existing code because most existing code simply does not depend on malloc/free doing anything other than and/or beyond what the C standard requires.
How would you propose that eliding free(malloc(x)) would break "most" existing code, anyways?
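For a concrete picture, here is a minimal sketch of the kind of pair in question. The function name `sum_via_heap` is made up for illustration. The allocation never escapes the function, so a compiler that treats malloc/free as having only their standard-defined semantics may drop both calls and reduce the body to `a + b`:

```c
#include <stdlib.h>

/* Hypothetical example: the heap block is allocated, used and freed
 * entirely inside this function, and its address never leaks out. */
int sum_via_heap(int a, int b)
{
    int *tmp = malloc(sizeof *tmp);
    if (tmp == NULL)
        return a + b;      /* same answer when allocation fails */

    *tmp = a + b;
    int result = *tmp;
    free(tmp);
    return result;
}
```

Whether the pair actually disappears depends on the compiler, version and flags. Building with optimisations on and off, or with `-fno-builtin-malloc`, and diffing the assembly is one way to check.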
As an example, user kentonv wrote: "I patched the memory allocator used by the Cloudflare Workers runtime to overwrite all memory with a static byte pattern on free". And the compiler would, like, "nah, let's leave all that data on stack".
Or somebody would try to plug in mimalloc/jemalloc or a debug allocator and wonder what's going on.
> As an example, user kentonv wrote: "I patched the memory allocator used by the Cloudflare Workers runtime to overwrite all memory with a static byte pattern on free". And the compiler would, like, "nah, let's leave all that data on stack".
Such a program would continue to function as normal; the dirty data would just be left on the stack. If the developer wants to clear that data too, they'd just have to modify the compiler to overwrite the stack just before (or just after) moving the stack pointer.
> Or somebody would try to plug in mimalloc/jemalloc or a debug allocator and wonder what's going on.
Again, that wouldn't be broken. They would see that no dynamic allocations were performed during that particular section. Which would be correct.
I'm a bit skeptical either example is representative of "most" existing software. If anything, the mere existence of __builtin_malloc and its default use should hint that most existing software doesn't care about malloc/free actually being called. That being said...
> As an example, user kentonv wrote: "I patched the memory allocator used by the Cloudflare Workers runtime to overwrite all memory with a static byte pattern on free". And the compiler would, like, "nah, let's leave all that data on stack".
Strictly speaking, I don't think eliding malloc/free would "break" those programs because that behavior is there for security if/when something else goes wrong, not as part of the software's regular intended functionality (or at least I sure hope nothing relies on that behavior for proper functioning!).
> Or somebody would try to plug in mimalloc/jemalloc [...] and wonder what's going on.
Why would mimalloc/jemalloc/some other general-purpose allocator care that it doesn't have to execute a matching malloc/free pair any more than the default allocator?
I'm not sure debug allocators would care either? If you're trying to debug mismatched malloc/free pairs then the ones the compiler elides are the ones you don't care about anyways since those are the ones that can be statically proven to be "self-contained" and/or correct. If you're gathering statistics then you probably care more about the malloc/free calls that do occur (i.e., the ones that can't be elided), not those that don't.
In any case, if you want to use a malloc/free implementation that promises more than the C standard does (e.g., special byte pattern on free, statistics/debug info tracking, etc.) there's always -fno-builtin-malloc (or memset_explicit if you're lucky enough to be using C23). Of course, the tradeoff is that you give up some potential performance.
Thank you for putting it in much more correct and understandable language than I could. That is exactly what I am talking about: if you call __builtin_malloc (e.g. via a macro definition in the libc header), the compiler is free to do whatever it wants. However, calling the "malloc" library function should call the "malloc" library function, and anything else is unacceptable and a bug. There should be no case where the compiler could assume anything about a function it does not see based simply on its name. Neither malloc nor strlen.
> That is exactly what I am talking about: if you call __builtin_malloc (e.g. via a macro definition in the libc header), the compiler is free to do whatever it wants. However, calling the "malloc" library function should call the "malloc" library function, and anything else is unacceptable and a bug.
I think that's an overly narrow reading of the footnote. I don't see an obvious reason why "such names" in the footnote should only cover "some macro names beginning with an underscore" and not also "external identifiers". And if implementations are allowed to define special semantics for "external identifiers", then... well, that's exactly what they did!
In addition, there's still the as-if rule. The semantics of malloc/free are defined by the C standard; if the compiler can deduce that there is no observable difference between a version of the program that calls those and a version that does not, why does it matter that the call is emitted? A function call in and of itself is not a side effect, and since the C standard dictates what malloc/free do the compiler knows their possible side effects.
Furthermore, the addition of memset_explicit and its footnote ("The intention is that the memory store is always performed (i.e. never elided), regardless of optimizations. This is in contrast to calls to the memset function (7.26.6.1)") implies that eliding calls is in fact acceptable behavior when optimizations are enabled. If eliding calls were not permissible when optimizing then what's the point of memset_explicit?
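To make the memset point concrete, here is a minimal sketch. Since memset_explicit is C23 and not yet widely available, `scrub_volatile` below is a hypothetical stand-in that writes through a volatile pointer, a common idiom that compilers honour in practice but not something the standard spells out as a guarantee. All function names are made up for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for memset_explicit: stores go through a
 * volatile-qualified pointer, so compilers generally keep them. */
static void scrub_volatile(void *buf, size_t len)
{
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}

/* The key is never read after the memset, so the call is a dead store
 * and an optimising compiler may remove it under the as-if rule. */
int check_key_plain(const char *input)
{
    char key[16] = "k-secret";
    int ok = (input[0] == key[0]);
    memset(key, 0, sizeof key);       /* may be elided */
    return ok;
}

/* Same logic, but the clearing is done with volatile stores. */
int check_key_scrubbed(const char *input)
{
    char key[16] = "k-secret";
    int ok = (input[0] == key[0]);
    scrub_volatile(key, sizeof key);  /* expected to survive optimisation */
    return ok;
}
```

Both functions return the same values for the same input, which is exactly why the compiler is allowed to treat the plain memset as removable: nothing the program can observe changes.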
> There should be no case where the compiler could assume anything about a function it does not see based simply on its name.
Again, external identifiers defined by the C standard are reserved. Reserved external identifiers aren't just for show. From the C89 standard:
> If the program defines an external identifier with the same name as a reserved external identifier, even in a semantically equivalent form, the behavior is undefined.
And from C23:
> If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), the behavior is undefined.
This means that yes, under modern compilers' interpretation of UB, compilers can assume things about functions based on their names, because modern compilers generally optimize assuming UB does not happen. The compiler does not need to see the function's implementation because it is the function's implementation as far as it is concerned.
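strlen is probably the easiest place to see this in practice. In the sketch below (the function name `greeting_length` is made up), optimising builds of GCC and Clang commonly replace the call with the constant 5, purely on the basis that `strlen` is the standard library function of that name:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: the argument is a string literal, so the
 * compiler can compute the length itself instead of emitting a call. */
size_t greeting_length(void)
{
    return strlen("hello");
}
```

Building with `-fno-builtin-strlen` (or `-fno-builtin`) should bring the actual call back, which is the escape hatch for code that really needs the library function to run.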
Ah yes, N2625 "What we think we reserve". Basically any C program containing variable or function "top", "END", "strict", "member" and so on is non-conforming and subject to undefined behaviour, so they define "potentially reserved" identifiers and as usual compiler vendors go and do the sane right thing.