And who says that faster runtime trumps all other considerations? I would much rather have my computer run slower than have to continuously deal with security vulnerabilities. My time is much more valuable than CPU time.
> I would much rather have my computer run slower than have to continuously deal with security vulnerabilities. My time is much more valuable than CPU time.
To play devil’s advocate and take the other side of this argument: I would rather the CPU I paid for spend its cycles performing actual application logic. Every cycle spent executing all this “safety code” overhead feels like paying for a developer’s sloppiness.
I feel end users pay a high cost for all this runtime checking and all these frameworks and levels of abstraction.
First, not all language safety features have to manifest themselves as run-time checks. A properly designed language will allow many safety checks to be done at compile-time.
And second, would you really rather deal with security holes on an on-going basis?
The problem with C is not that you can write unsafe code in it, it is that the design of the language -- and pointer aliasing in particular -- makes it impossible to perform even basic sanity checks at compile time [EDIT: or even at run-time for that matter]. For example:
    int x[10];
    ...
    int y = x[11];
In any sane language, that would be a trivially checkable compile-time error. But in C it is not, because there are all kinds of things that could legally happen where the ellipsis is that would keep this code from violating any of C's safety constraints. That is the problem with C, and it is a fundamental problem with C's design: it conflates pointers and arrays. C was designed for a totally different world, and it is broken beyond repair.
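To make the conflation concrete, here's a minimal sketch (the function names are just for illustration) of how the size information evaporates the moment the array crosses a function boundary:

    #include <stdio.h>

    /* Inside f, `a` is just a pointer: sizeof(a) is the size of a
       pointer, not of the array, so a bounds check isn't even
       expressible here. */
    void f(int *a) {
        int y = a[11];               /* out of bounds, compiles silently */
        printf("%d %zu\n", y, sizeof(a));
    }

    int main(void) {
        int x[10] = {0};
        printf("%zu\n", sizeof(x));  /* 40: the size is known here... */
        f(x);                        /* ...and lost here, by decay */
        return 0;
    }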
I'm not here to defend C, but to point out a problem that many advocates make when they cherry pick examples of how their language is both safer and more performant than C. Simply put, that example is irrelevant when the size of the array is not known at compile time. The moment that you have a dynamically allocated array, you are either dropping safety (e.g. expecting the developer to perform bounds checks when necessary) or performance (e.g. the compiler inserts bounds checks at runtime).
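To illustrate the first horn of that dilemma, here is roughly what the manual version looks like (checked_get is a hypothetical helper, not a standard function):

    #include <stddef.h>

    /* Hypothetical helper: the check the developer has to remember to
       write by hand in C. A safe language's compiler emits the moral
       equivalent of this branch on every access. */
    int checked_get(const int *a, size_t len, size_t i, int *out) {
        if (i >= len)       /* the runtime cost under discussion */
            return -1;
        *out = a[i];
        return 0;
    }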
It is also worth noting that there is nothing preventing C compilers from performing compile-time checks on examples such as yours. I just tried a similar example in gcc, and with warnings and the optimizer turned on it does catch the unsafe code.
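For reference, the invocation looks something like this (the exact diagnostic wording varies between gcc versions):

    $ gcc -O2 -Wall -c example.c
    example.c: warning: array subscript 11 is above array bounds
    of 'int[10]' [-Warray-bounds]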
> The moment that you have a dynamically allocated array, you are either dropping safety (e.g. expecting the developer to perform bounds checks when necessary) or performance (e.g. the compiler inserts bounds checks at runtime).
Yes, that's true. So? The original claim is that it is self-evident that this tradeoff should be resolved in favor of performance in the design of the language, and it just isn't (self-evident). If anything, it is self-evident to me that the tradeoff should be resolved in favor of safety in today's world.
There are patterns, though, that can help. C compilers are capable of recognising and optimising many forms of iteration, but being able to tell the compiler explicitly that you're iterating over a collection gives you that much more confidence that the compiler will do the right thing.
Especially when, to get the compiler to do the right thing safely, you need to add manual bounds checks in the hope that the compiler will optimise them away, with no mechanism to ensure that it actually does.
It depends greatly on the problem at hand, but there are definitely cases where even with a dynamic array size we can unroll our loop such that we check bounds less than once per iteration.
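A sketch of that pattern, assuming a simple sum over a runtime-sized array:

    #include <stddef.h>

    /* One bounds comparison covers four element accesses, so we check
       less than once per element even though n is only known at runtime. */
    long sum4(const int *a, size_t n) {
        long s = 0;
        size_t i = 0;
        for (; i + 4 <= n; i += 4)   /* one check per four elements */
            s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
        for (; i < n; i++)           /* remainder, checked per element */
            s += a[i];
        return s;
    }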
Bad example. There's nothing[0] you could put in the ellipsis to make that code valid, and both gcc and clang will warn about it (clang on defaults, gcc with -Wall).
[0] Ok, I guess you could #define x something, but that's not interesting from a static analysis perspective.
Warning is one thing, but crashing is better. That's possible to do in a C compiler too of course, because in this example the array hasn't decayed to a pointer and its size can be recovered.
The issue is that when you pass the array to another function, the compiler can't track the size without changing the ABI.
If the choice is between a compile time warning and a runtime crash, I will take the warning every single time: much closer to the actual error. You're probably asking for a compile time error instead.
Indeed the `-Werror` option is often a good thing to have (though I don't set it by default on my free software projects, because other people might use other compilers with different warnings, and I don't want to block them outright).
-Werror is an interesting case -- it's an example of a key difference between C and Rust.
Rust's compiler will reject programs unless it can prove them to be valid. C compilers will accept programs unless they can prove them to be invalid. But then C warnings can lead to an indeterminate state: code that looks iffy may be rejected, but we've not necessarily proven that the code is wrong. We're still trusting the programmers' claim that code which may exhibit undefined behaviour with certain inputs won't ever receive those inputs.
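A small, contrived C example of that indeterminate state: with optimization on, gcc's -Wmaybe-uninitialized will typically flag this, yet the code is fine for the inputs the author has in mind:

    /* The compiler hasn't proven this wrong; it just can't prove it
       right. We compile anyway, trusting the claim that flag is
       always 0 or 1. */
    int pick(int flag) {
        int y;
        if (flag == 0) y = 10;
        if (flag == 1) y = 20;
        return y;    /* UB only if the caller breaks that promise */
    }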
I meant a crash. Obviously both at once is best, but you can detect the possibility of the crash at compile time (disassemble your program and see the bounds check), and it turns a possible security issue into a predictable crash so that’s safer.
I don’t really love forcing errors; when a program is “under construction” you should be able to act like it is and not have to clean up all the incomplete parts. It also annoys people testing new compilers against your code.
> Indeed the `-Werror` option is often a good thing to have (though I don't set it by default on my free software projects, because other people might use other compilers with different warnings, and I don't want to block them outright).
Another problem with C. There's way too much implementation-dependent behavior.
Not with Monocypher. That project has one implementation-defined behaviour (right shifts of negative integers), and it's one where all platforms behave exactly the same (they propagate the sign bit). In over 5 years, I haven't got a single report of a platform behaving differently (which would result in public-key crypto not working at all).
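For the curious, the behaviour in question is this one:

    #include <stdio.h>

    int main(void) {
        /* Right-shifting a negative value is implementation-defined in
           C, but in practice every platform does an arithmetic shift,
           propagating the sign bit: -2 >> 1 yields -1. */
        printf("%d\n", -2 >> 1);
        return 0;
    }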
However, I do get spurious warnings, such as warnings about mixing arithmetic and bitwise operations, even in cases where that's intended. Pleasing every compiler is not trivial.
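For example, an idiom like this can trip a "suggest parentheses" style warning on some compilers, even though the precedence is exactly what was meant:

    #include <stdint.h>

    /* `+` binds tighter than `^`, and (a + b) ^ c is exactly what is
       intended here, but some compilers still suggest parentheses when
       arithmetic and bitwise operators are mixed. */
    uint32_t mix(uint32_t a, uint32_t b, uint32_t c) {
        return a + b ^ c;
    }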
What do you suppose you could put in the ellipsis that would make your second statement defined behavior, other than preprocessor stuff or creating a new scope with a different variable named x? (Neither of which would frustrate a compiler wanting to give you a warning.)
The real overhead is not safety code but inefficient design and unnecessary (and often detrimental to the user) features. The added 10% of safety checks is nothing compared to the orders of magnitude of bloat modern software has. Moreover, a lot of C codebases have the same safety checks implemented manually, sacrificing performance in order to have fewer bugs. And when it comes to designing a modern C alternative, it's not just a performance/safety tradeoff: there are parts of C that are simply bad, like null-terminated strings, and are long overdue for replacement with something better.
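As a sketch of "something better", the usual replacement is a length-carrying string along these lines (the struct here is illustrative, not any particular library's):

    #include <stddef.h>

    /* A minimal fat-pointer string: the length travels with the data,
       so bounds are checkable in O(1), and the bugs that come with
       missing NUL terminators disappear. */
    typedef struct {
        size_t len;
        const char *data;   /* not necessarily NUL-terminated */
    } str;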
I suspect that people arguing that safety checks make things slow are basing that on the now very stale idea that CPUs work sequentially, which isn't true at all for superscalar machines. They're also forgetting that compilers will optimize a lot of those checks away.
There is an asymmetrical risk here. On one hand, the compiler might emit checks that aren't needed. On the other, the programmer might fail to insert a check that is needed.
Your argument and the parent's are both strong and reasonable. Compared with other overhead that developers deliberately add to their programs, language safety checks are probably a drop in the bucket. We're in a world where developers think it's reasonable to run a chat app on top of an entire browser framework on top of the platform SDK on top of the OS. The runtime cost of checking an array's bounds is the least of our concerns.
> I would rather the CPU I paid for spend its cycles performing actual application logic.
The problem is that, as far as the CPU is concerned, programming errors are application logic too, and they get executed; cf. protected memory. You don't run everything in ring 0, do you?
That's not the point. The point the parent is making is that yes, languages that include safety features are slower, but you need those same safety features in C as well, written into the code by hand. So your program will also be slower in C.
There is something backwards about how safety checks are done in C.
First it's up to the programmer to put them in.
Then, hopefully, the compiler will optimize away the unneeded ones. If the programmer biffs and forgets one, though, then oops. And of course, classically, we blame the programmer rather than the system he's been forced to work under.
That might have seemed like a reasonable arrangement in 1982. Which was 40 years ago.
It would be better if the compiler inserted the checks automagically and removed the ones it knows it doesn't need. And as a bonus, if the programmer puts one in explicitly, leave it alone.
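That last part is already visible in miniature today: a check the compiler can prove redundant costs nothing. For instance:

    #include <stddef.h>

    long sum(const int *a, size_t n) {
        long s = 0;
        for (size_t i = 0; i < n; i++) {
            if (i < n)   /* provably true given the loop condition;
                            an optimizing compiler deletes this branch */
                s += a[i];
        }
        return s;
    }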
Yes, I think we're actually all in agreement here. The common point is that TFA's argument that we need to keep C because it's faster is invalid. The GP's point is that it is not in fact faster because of all the things you need to do to produce code that can actually be deployed in today's world. My point is that even if C were faster (which it actually isn't) that would not matter because C is unsafe even with all the extra stuff that people stick onto it to try to make it safe. So the original claim is really a judgement call that speed matters more than safety, and this is not something about which there is any kind of consensus.
What does slower actually mean? Worrying about the slowdown from safety while computers continue to get faster and/or cheaper (depending on how you spend your transistors/cost) is a fool's paradise. Don't be baited into that flawed logic.
If your transistor/cost curve has a doubling time of 3 years (156 weeks), a 5% performance difference is worth approximately 11 weeks of hardware progress: 156 × log2(1.05) ≈ 11.
If the doubling time is 2 years (104 weeks), a 5% difference is approximately 7 weeks: 104 × log2(1.05) ≈ 7.3.
How fast do we have to get before safety is table stakes? Focusing on raw unsafe speed wouldn't pass as a normal metric in any other industry. I'll happily spend 7-11 weeks of performance gains on correctness.
If you remove the mechanically preventable bugs from consideration, then by definition the only ones you now need to focus on are the ones NOT prevented by mechanism.
How is this not a win? We only have so many decisions we can make per day.
So, let's take the most extreme example of pursuing safety, WUFFS.
Unlike general purpose languages WUFFS has a very specific purpose (Wrangling Untrusted File Formats Safely) so it doesn't need to worry that it can't solve some of your problems, which frees it to completely refuse to do stuff that's unsafe, while going real, real fast.
WUFFS gets to completely omit bounds checks, which you probably wouldn't have dared try in C because it's so obviously unsafe, but WUFFS already proved at compile time that it can't ever have bounds misses so it needn't do these checks. WUFFS gets to also omit integer overflow checks, because again it proved at compile time that your code is correct and cannot overflow for any input. And since WUFFS knows how big the data is at all times it gets to automatically generate loop unrolling and suchlike accelerations.
WUFFS isn't a general purpose language, it doesn't have strings, it doesn't have growable arrays (what C++ calls "vectors"), it doesn't have any dynamic memory allocation, but then we weren't talking about how much you love general purpose features, we were talking about safety preventing security bugs. Which is exactly what WUFFS does.
I can't respond to that unless you are more specific about what are "the ones that lead to really bad outcomes". There are a lot of arbitrary-code-execution attacks that are enabled by buffer overflows, and IMHO that's as bad as it gets. There is absolutely no legitimate excuse for a buffer overflow in today's world. Switching to safe(r) language won't prevent all attacks, but it will make a big dent.
It defeats some extremely important classes of exploits. And I'm not sure how they're not ones that lead to really bad outcomes since they lead to fun ones such as arbitrary code execution all the time. I can create a C program with hideous vulnerabilities in about five minutes without doing anything that isn't totally standard and normal (albeit obviously vulnerable so technically buggy). I'd have to actually look up how to make my code vulnerable in languages with more safety features.