pizlonator a day ago

To me, "memory safety" really means:

- There are clearly stated guarantees, like "a pointer cannot access outside the bounds of its allocation" and "if you free an object while there are still pointers to it then those pointers cannot be dereferenced". These guarantees should be something you can reason about formally, and they should be falsifiable. Not sure this presentation really has that. It's not clear what they prevent, and what they don't prevent.

- There is no way to break out of the clearly stated guarantees. Totally unclear that whatever guarantees they have are actually guarded against in all cases. For example, what if a tmp_alloc'd object pointer escapes into another tmp_alloc'd object with a different lifetime - I get that they wouldn't write code that does that intentionally, but "memory safety" to me means that if you did write that code, you'd either get a compile error or a runtime error.
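
A minimal sketch of that escape scenario, assuming a hypothetical bump-style `tmp_alloc` arena (my stand-in, not the presentation's actual API): a pointer from one arena lifetime escapes into a longer-lived struct, the arena is reset, and the stale pointer silently reads whatever got allocated next. In plain C this compiles and runs with neither a compile error nor a runtime error:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical arena: a bump pointer over a static buffer. */
static _Alignas(max_align_t) char arena[256];
static size_t arena_off;

static void *tmp_alloc(size_t n) {             /* bump-pointer allocate */
    void *p = &arena[arena_off];
    arena_off += n;
    return p;
}

static void tmp_reset(void) { arena_off = 0; } /* "free" the whole arena */

struct holder { int *escaped; };               /* longer-lived object */

int stale_read(void) {
    int *shortlived = tmp_alloc(sizeof *shortlived);
    *shortlived = 42;

    struct holder h = { shortlived };          /* pointer escapes its lifetime */

    tmp_reset();                               /* arena recycled... */
    int *fresh = tmp_alloc(sizeof *fresh);
    *fresh = 7;                                /* ...and the memory reused */

    return *h.escaped;                         /* stale pointer now reads 7 */
}
```

`stale_read()` returns 7, not 42: the escaped pointer quietly observes unrelated data, which is exactly the class of bug a memory-safe implementation has to turn into a compile-time or runtime error.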

It's possible to ascribe clearly stated guarantees to C and to make it impossible to break out of them (Fil-C and CHERI both achieve that).

  • johnnyjeans a day ago

    > There is no way to break out of the clearly stated guarantees.

    I disagree on this, and having escape hatches is critically important. Are we really going to call Rust or Haskell memory unsafe because they offer ways to break their safety guarantees?

    • pizlonator a day ago

      I think that Rust's clearly stated guarantee holds if you never say "unsafe".

      That's still a clear statement, because it's trivial to tell if you used "unsafe" or not.

      • johnnyjeans a day ago

        Maybe I just misinterpret what you mean. When you say "no way" and "all cases", I take your meaning literally. The existence of pointers to bypass the borrow checker, disabling runtime bounds checks and unsafe blocks are exactly that: escape hatches to break Rust's safety, in the same way type-casting is an escape hatch to break C's (anemic) type safety, and unsafePerformIO in Haskell is an escape hatch to break every bone in your body.

        • pizlonator a day ago

          That’s fair.

          FWIW, Fil-C’s guarantees are literally what you want. There’s no escape.

      • the__alchemist a day ago

        Trivial is not a word I would use here! Rust's `unsafe` gets fuzzy as you traverse an operation's dependencies! There are many applications where marking a function as `unsafe` is subjective.

        • pizlonator a day ago

          True.

          I do think the ideal kind of memory safe language either has no "unsafe", or has an "unsafe" feature that only needs to be used in super rare and obscure cases (Java is like that, sort of).

          Fil-C has no "unsafe", so in that sense Fil-C is safer than Rust. You don't need an escape hatch if the memory safety guarantees are dialed in just right.

          • titzer a day ago

            In Virgil, native targets have a couple of unsafe operations available, one of which is to be able to forge a closure from a pair of a code pointer and an object reference. This is used, e.g. to implement Wizard's JIT and fast interpreter, which generate new machine code at runtime. I can't imagine the level of proof necessary to make that safe--and not just the safety of the generated machine code, but it often lacks bounds checks because it relies on running verified Wasm bytecode, which cannot go out of bounds. So the proof would have to include a complete proof of correctness for the code validation algorithm (i.e. part of Wizard itself).

            • pizlonator a day ago

              I dream of a JIT API that lets you propose machine code that is checked using an abstract interpreter to ensure that you have adequate checks to stay within the host language's type system.

              Someday, man

              • titzer a day ago

                This will kill baseline JIT performance.

                I am trying out a couple of new directions, e.g. generating more of the tiers from a more abstract description, constantly shrinking the amount of hand-written compiler/interpreter code. My hard requirement is the end result has to be pretty darn close to what I'd write by hand.

                One thing I am thinking about now is how to make more use of the implementation language's (e.g. Virgil) compiler to be able to paste together machine code templates gotten from writing in the implementation language. Think copy-and-patch compilation, but as a language primitive. E.g. "please emit an inlined copy of the machine code for this function (first-class ref to said function) into memory here, under this ABI".

                • pizlonator a day ago

                  > This will kill baseline JIT performance.

                  You say that and then you describe exactly what I would have used as a solution: copy and patch.

                  Just have the checker check the templates that the baseline JIT is stitching together and then have a safe way to ask for the prechecked templates to be stitched together.

                  Bunch of details in getting that right obviously, but it doesn’t seem impossible.

                  • titzer 11 hours ago

                    We're not that far off from each other.

                    > Just have the checker check the templates that the baseline JIT is stitching together

                    Sure, from my second paragraph, the templates it's using could be opaque things it got from requesting the static compiler generate a template from a first class function ref (at compile time), in which case the verification has already been done.

                    • pizlonator 9 hours ago

                      I think we're saying the same thing

              • porridgeraisin a day ago

                I don't get this, can you explain?

                • pizlonator a day ago

                  Sure.

                  If I told you that I have a snippet of machine code that:

                  - obeys the ABI of your safe language (i.e. it has exactly the calling convention that safe language uses)

                  - corresponds exactly to a function body whose signature is T->U (or whatever, different safe languages have different function type syntax)

                  - obeys the language’s type system.

                  Then you could run an abstract interpreter to check that the machine code follows that type system. Simple example: given the above claims, if we further assume that the host language impl puts argument one into register 5, and the first argument’s type is “pointer to an array of bytes”, and we know that arrays have a 64-bit length prefixed to the start, then the abstract interpreter would just need to check that any deref of register 5 is preceded by a bounds check on whatever was loaded at offset -8 from register 5. And so on, for every possible thing you can do in the language.

                  Then the JIT would just have to make sure it puts checks in all of the places that the absint expects them. If the absint fails, then the machine code is rejected.
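
The checker described above can be modeled with a toy verifier (an entirely invented mini-IR, not any real JIT's format): walk the instruction stream, track per-register "bounds proven" facts, and reject any dereference that isn't covered by a still-valid bounds check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy IR: just enough ops to express "check, then deref". */
enum op { LOAD_LEN, BOUNDS_CHECK, DEREF, CLOBBER };

struct insn {
    enum op op;
    int reg;
};

/* Abstract interpreter: accepts the snippet only if every DEREF of a
   register is preceded by a BOUNDS_CHECK on it that is still in force. */
bool verify(const struct insn *code, int n) {
    bool checked[16] = { false };    /* abstract state per register */
    for (int i = 0; i < n; i++) {
        switch (code[i].op) {
        case BOUNDS_CHECK: checked[code[i].reg] = true;  break;
        case CLOBBER:      checked[code[i].reg] = false; break; /* fact killed */
        case DEREF:
            if (!checked[code[i].reg]) return false;     /* unproven access */
            break;
        case LOAD_LEN:     break;    /* reading the length prefix is always safe */
        }
    }
    return true;
}
```

So `LOAD_LEN r5; BOUNDS_CHECK r5; DEREF r5` verifies, while a bare `DEREF r5`, or a check followed by a `CLOBBER` of the register, is rejected. A real verifier would also have to track control flow, value ranges, and the ABI claims listed above, but the shape is the same.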

          • amluto a day ago

            I think there’s a sense in which Rust is safer than Fil-C, though: Rust allows abstractions with little to do with memory safety but that still can’t be broken without 'unsafe'. So a struct called EvenNumber can fairly strongly guarantee that it contains an even number.

            But Fil-C objects (at least for now?) only seem to allow one single capability type, and that capability grants unrestricted read/write access to the object’s bytes.

            I wonder if one could build a handle system in Fil-C that would allow this to be extended. Or if a different variant of a Fil-C-like system could distinguish between pointers with different access levels to an object and could allow only the correct piece of trusted code to increase the permission of a pointer.

            • pizlonator a day ago

              I’ve thought about adding such things to Fil-C but have held off on going there because it feels like a bridge too far.

              What I mean by that is: the memory safety issues of C are a total dumpster fire, while whether a number is even or not (and whether you can prove that) is maybe like icing on the dumpster fire. It just doesn’t matter by comparison.

              So I want to decisively fix the memory safety issues and not lose focus.

          • hotjump a day ago

            The problem with migrating C code to memory-safe languages is that legacy C projects aim for extremely high performance. Garbage-collected languages would also be safe in every situation, but I want to note that the recent tendency toward Rust derives from its type-system-based approach, which imposes very few runtime checks such as bounds checking. I myself hope something like F* gets more traction in the industry.

            • pjmlp 16 hours ago

              That is the usual myth. Back when I initially learned C, hobby programmers writing Assembly could easily outperform the machine code generated by C compilers; that is why books like those from Mike Abrash exist.

              C got its performance fame thanks to optimizing compilers that abuse UB semantics.

              Microsoft's .NET team, especially in Stephen Toub's great blog posts, has been showing off how much performance can be squeezed out of a managed-language compiler toolchain when people actually care.

              Also, let's not forget Apple only moved away from Object Pascal because an internal team, owing to their UNIX roots, initially built MPW as a kind of submarine project; and even then their focus was C++, not C.

            • pizlonator a day ago

              I hear the perf claim a lot and yet most of the time when I write C/C++ code it's not because it's the most performant. It's either because I'm editing a codebase that's already in C/C++, or I'm using libraries whose best (or only) bindings are C/C++, or because I want to make syscalls (and safe languages don't expose those as nicely as C).

              • rrrix1 a day ago

                For desktop, server, web, mobile, etc., this holds true. Not so much for embedded systems, or anything with unattractive memory capacity or processor performance. Rust is starting to make its way in, but C and even assembly are still king AFAIK.

                • pizlonator a day ago

                  And yet there are embedded systems running JavaScript and Python.

                  • kv85s a day ago

                    Are there any real-world, production systems based on JavaScript and Python?

                    Toy systems, yes. Hey, I too, think CircuitPython is really neat. But I'm skeptical someone would base a PLC (or similar) on it.

          • imtringued 12 hours ago

            Have you thought about building a C verification framework around Fil-C?

            Fil-C doesn't necessarily have to run in production. It just needs to catch the bugs, e.g. by making it easy to fuzz C code compiled via Fil-C.

            • pizlonator 9 hours ago

              I think you should run Fil-C in production.

      • jeffrallen a day ago

        But the Rust ecosystem is littered with unsafe, so good luck getting the actual benefits of Rust. :(

        • woodruffw a day ago

          The actual benefits seem manifest: there aren't nearly as many public reports of memory corruption (much less exploitable corruption) in Rust components, even when those components make extensive use of unsafety directly or transitively.

          (This seems like one of those "throw the baby out with the bathwater" cases that people relitigate around Rust -- there's ample empirical evidence that building safe abstractions around unsafe primitives works well.)

          • pizlonator a day ago

            > The actual benefits seem manifest: there aren't nearly as many public reports of memory corruption (much less exploitable corruption) in Rust components, even when those components make extensive use of unsafety directly or transitively.

            Not sure the data is clean enough to draw meaningful conclusions because of confounding factors.

            The biggest confounding factor is that Rust is relatively new, code written in it is even newer, and folks who research vulns may not have applied the same level of anger to Rust as to C.

            That said, your point about "throwing the baby out with the bathwater" is well taken. I would expect that Rust has much fewer vulns than C/C++. My point is only that it's an unproven expectation.

            • woodruffw a day ago

              > Not sure the data is clean enough to draw meaningful conclusions because of confounding factors.

              I'm thinking of things like the Windows user- and kernel-mode font parsers; these have a pretty long and steady public history of exploitation that seems to have mostly stopped with the Rust rewrite 1-2 years ago. I don't think that's because vuln researchers have stopped looking at them!

              But yeah, I would like it if Google and Microsoft (among others) would put more hard data out there. I don't think of the Windows kernel teams as typically suffering from hype-driven development, so my abductive conclusion is that they have strong supporting data internally.

              Edit: here's a hard data source from Google, showing that Rust has contributed to a marked decline in memory unsafety in Android[1].

              [1]: https://security.googleblog.com/2022/12/memory-safe-language...

              • pizlonator a day ago

                That's good data, but: is the reduction in vulns because Rust is safer, or is it because vuln researchers assume it's safer and so choose to look for vulns in C/C++ code because it's what they're familiar with?

                It's hard to say.

        • simonask a day ago

          Wait, what? Unsafe in Rust _is_ the actual benefit. That is to say, what `unsafe` does is allow you to implement a fundamentally tricky thing and express its safety invariants in the type system and lifetime system, resulting in a safe API with no tricky parts.

          That's the whole point.

          There's tons of trivial unsafe in the Rust ecosystem, and a little bit of nontrivial unsafe, because crates.io is full of libraries doing interesting things (high-performance data structures, synchronization primitives, FFI bindings, etc.) while providing a safe API, so you can do all of that without writing any unsafe yourself.

          The point of Rust isn't that you can implement low-level data structures in safe code, but that you can use them without fear.

          • pizlonator a day ago

            Saying that unsafe is a benefit is backwards, I think. Unsafe is the compromise Rust made to achieve its other goals, but of course Rust would be better if there was some way of doing things without unsafe.

            • not2b a day ago

              Consider the split_at_mut function. It takes one mutable slice, and returns two mutable slices by chopping at a caller-specified point. It's in the standard library.

              The operation is completely safe, as after the call the original mutable slice is no longer live, but the borrow checker won't let you write such a function yourself unless you tag it as unsafe, so that's what the implementation must do.

              The same thing happens in the implementation of Vec: there is low-level code that is unsafe, used to provide safe abstractions.

              • pizlonator a day ago

                That’s not a benefit of Rust. In other safe languages you could implement those things without writing a single bit of unsafe code. (This is true in Fil-C for example.)

                • amluto a day ago

                  Is that really fair? Neither Fil-C nor C has anything that particularly resembles & or &mut. If Rust had an &shared_mut style of reference, you could presumably split it without unsafe. For that matter, Rust does have various interior-mutable types, and you could have a shared reference to a slice of AtomicBool or whatever, and you can split that without any particular magic.

                  • pizlonator a day ago

                    Fil-C doesn’t have those things because it doesn’t need them to achieve safety.

                    • Dylan16807 10 hours ago

                      How does it make sure data races are safe?

                      Rust can split up an array without unsafe if you use Rc. That has overhead, but is it more than Fil-C in this situation?

                      • pizlonator 9 hours ago

                        Data races are memory safe in Fil-C.

                        • amluto 8 hours ago

                          I think the disconnect here is that some people want their data races (a) not to result in memory safety violations; some people want them to also (b) not result in nonsensical control flow or code producing results that look impossible, and some people want (c) data races to simply not occur.

                          You seem to like (a), Linus Torvalds seems to like (b), and Rust targets (c).

                          • pizlonator 7 hours ago

                            Rust doesn’t target (c). That’s a hugely disingenuous claim.

                            The truth is: Rust’s approach to memory safety only works if you also have no data races. This makes it a strictly inferior approach to memory safety, since programs sometimes do have to race (sometimes it’s just the best solution). So, Rust has data race prevention not because it’s a good idea but because the whole language falls apart without it.

                            • NobodyNada 3 hours ago

                              > Rust has data race prevention not because it’s a good idea but because the whole language falls apart without it.

                              In terms of the design goals and evolution of the Rust language, this is exactly backwards. Rust was originally conceived as a garbage-collected, green-threaded language designed for concurrent programming -- think something similar to Go but with a stronger type system and no data races. The ownership model was created, first and foremost, not for memory safety but to prevent data races; and, more broadly, to help programmers reason about the correctness of their code.

                              Midway through development, as people started using the language & getting a feel for it, the language designers realized that the ownership model was powerful enough to express not just data race freedom but full memory safety without needing a GC (some writing from this stage in Rust's development: https://smallcultfollowing.com/babysteps/blog/2013/06/11/on-..., https://pcwalton.github.io/_posts/2013-06-02-removing-garbag...). So they removed the GC, and that decision positioned Rust where it is today: a nicer, memory-safe "C/C++ competitor", popular for systems code where the performance overhead or runtime complexity of a garbage collector is considered unacceptable.

                        • Dylan16807 6 hours ago

                          Yes but how do you accomplish that? If you do nothing at all, pointers can tear. And allocation sizes too if they're stored inline.

                          • pizlonator 4 hours ago

                            The capability is internally a pointer to a monotonic GC-allocated object.

                            That capability pointer is always stored and loaded using monotonic 64-bit accesses.

                            Therefore, in the worst case you'll get a pointer that is torn from its capability, and then you'll trap accessing that pointer. But you'll never get a corrupt capability.

                            If you don't want the pointer to tear from its capability, just use `_Atomic`, `volatile`, or `std::atomic`. Then Fil-C uses lock-free shenanigans to make sure that the capability and pointer travel together and don't tear from one another.
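
In plain C11 terms (outside of Fil-C), the "just use `_Atomic`" advice looks like this; the standard guarantees an `_Atomic` pointer is stored and loaded as a single unit, so a reader never observes a torn value. (Per the comment above, Fil-C additionally keeps the capability traveling with the pointer; that part is Fil-C-specific and not shown here.)

```c
#include <assert.h>
#include <stdatomic.h>

static int a = 1, b = 2;
static _Atomic(int *) shared = &a;      /* atomic pointer: no tearing */

int swing_and_read(void) {
    atomic_store(&shared, &b);          /* whole-pointer store */
    int *p = atomic_load(&shared);      /* whole-pointer load */
    return *p;                          /* always a consistent pointer */
}
```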

                            • Dylan16807 3 hours ago

                              Okay, in that case I do think one of the data-holding types in rust is a fair comparison, adding some runtime checks to let you be looser about ownership. Then you can write a function to split/share an array without a single bit of unsafe code.

              • jeffrallen a day ago

                I don't see the interest in split_at_mut. I can get the same thing by reslicing in Go. And also the GC will do the job the borrow checker foists off onto the unlucky Rust programmer.

                Pfft, whatever. Rusters gonna rust, I guess.

                • not2b a day ago

                  I'm not a Ruster, I've spent my career writing a ton of C++. But I'm interested in Rust as an alternative because it doesn't need GC.

                  But in this case, it's not just the memory safety I'm interested in, it's the data races. If we have multiple threads but can guarantee that any object either has only read-only references, or one mutable reference and no readers, we don't have data race issues.

  • kstrauser a day ago

    Thinking aloud, and this is probably a bad idea for reasons I haven’t thought of.

    What if pointers were a combination of values, like a 32 bit “zone” plus a 32 bit “offset” (where 32/32 is probably really 28/36 or something that allows >4GB allocations, but let’s figure that out later). Then each malloc() could increment the zone number, or pick an unused one randomly, so that there’s enormous space between consecutive allocs and an address wouldn’t be reissued quickly. A dangling pointer would then point at an address that isn’t mapped at all until possibly 2^32 malloc()s later. It wouldn’t help with long-lived dangling pointers, but would catch accessing a pointer right after it was freed.

    I guess, more generally, why are addresses reused before they absolutely must be?
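
A user-space model of the zone+offset scheme (names hypothetical, and a real version would use virtual-memory mappings rather than a table): zone ids are handed out monotonically and never reissued, so a dangling "pointer" names a zone that no longer exists, and the dereference can be caught.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ZONES 1024                 /* sketch only; no wraparound handling */

static void *zone_base[MAX_ZONES];     /* NULL = never allocated, or freed */
static uint32_t next_zone = 1;         /* monotonically increasing, never reused */

typedef uint64_t zptr;                 /* high 32 bits: zone, low 32: offset */

zptr zalloc(size_t size) {
    uint32_t z = next_zone++;
    zone_base[z] = malloc(size);
    return (zptr)z << 32;              /* offset 0 into a fresh zone */
}

void zfree(zptr p) {
    uint32_t z = (uint32_t)(p >> 32);
    free(zone_base[z]);
    zone_base[z] = NULL;               /* this zone id is dead forever */
}

/* Returns the real address, or NULL if the zone was freed (use-after-free). */
void *zderef(zptr p) {
    uint32_t z = (uint32_t)(p >> 32);
    if (zone_base[z] == NULL) return NULL;
    return (char *)zone_base[z] + (uint32_t)p;
}
```

After `zfree`, any `zderef` of the old handle fails deterministically instead of aliasing a recycled allocation; the cost is the extra bookkeeping and address-space consumption.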

    • zyedidia a day ago

      It sounds like what you're describing is one-time allocation, and I think it's a good idea. There is some work on making practical allocators that work this way [1]. For long-running programs, the allocator will run out of virtual address space and then you need something to resolve that -- either you do some form of garbage collection or you compromise on safety and just start reusing memory. This also doesn't address spatial safety.

      [1]: https://www.usenix.org/system/files/sec21summer_wickman.pdf

      • naasking 8 hours ago

        > For long-running programs, the allocator will run out of virtual address space and then you need something to resolve that -- either you do some form of garbage collection or you compromise on safety and just start reusing memory

        Or you destroy the current process after you marshal the data that should survive into a newly forked process. Side benefit: this means you get live upgrade support for free, because what is a live upgrade but migrating state to a new process with updated code?

      • kstrauser a day ago

        Oh, nifty! I guarantee you anyone else discussing this has put more than my 5 minutes' worth of thought into it.

        Yeah, if you allow reuse then it wouldn't be a guarantee. I think it'd be closer to the effects of ASLR, where it's still possible to accidentally break things, just vastly less likely.

    • pizlonator a day ago

      That’s a way of achieving safety that has so many costs:

      - physical fragmentation (you won’t be able to put two live objects into the same page)

      - virtual fragmentation (there’s kernel memory cost to having huge reservations)

      - 32 bit size limit

      Fil-C achieves safety without any of those compromises.

      • kstrauser a day ago

        For sure. I'm under no illusion that it wouldn't be costly. What I'm trying to suss out is whether libc could hypothetically change to give better safety to existing compiled binaries.

        • pizlonator a day ago

          Two things:

          - The costs of your solution really are prohibitive. Lots of stuff just won't run.

          - "Better" isn't good enough because attackers are good at finding the loopholes. You need a guarantee.

    • throwawaymaths a day ago

      you can do this easily with virtual memory, and IIRC Zig's general purpose allocator does this under some circumstances (don't remember if it's the default or if it needs a flag).

  • blacksqr a day ago

    There are and have been many techniques and projects for making C more memory-safe. The crucial question it always comes down to is what performance hit do you take using them?

    That's why C has been on top for so long. Seat-of-the-pants hand-crafted C has always been the fastest high-level language.

    • WalterBright a day ago

      C's memory safety could be drastically improved with the addition of bounds-checked arrays (which is an extension, and does not change existing code):

      https://www.digitalmars.com/articles/C-biggest-mistake.html

      25 years of experience with D has shown this to be a huge improvement.

      D also has references as an alternative to pointers. References cannot have arithmetic done on them. Hence, by replacing pointers with references, and with array bounds checking, the incidence of memory corruption is hugely reduced.

      • pizlonator a day ago

        > C's memory safety could be drastically improved with the addition of bounds-checked arrays (which is an extension, and does not change existing code):

        If you solved that problem then you'd still have a dumpster fire of memory safety issues from bad casts, use after free, etc.

        • WalterBright a day ago

          Array overflow is consistently the number one memory safety bug in shipped code, by a wide margin.

    • WalterBright a day ago

      I found that C programs rarely evolve beyond their initial design. The trouble is, it's hard to refactor C programs. For example,

          struct S { int a; };
          struct S s; s.a = 3;
          struct S *p; p->a = 3;
      
      I.e. a . is for direct access, -> for indirect access. Let's say you want to change passing S by value to passing S by pointer. Now you have to update every use, instead of just the declaration.

      This is how it would work in D:

          struct S { int a; }
          S s; s.a = 3;
          S* p; p.a = 3;
          ref S q; q.a = 3;
      
      And so refactoring becomes much easier, and so happens more often.

      • WalterBright a day ago

        > C has always been the fastest high-level language.

        C has another big speed problem. Strings are 0 terminated, rather than length terminated. This means constant scanning of strings to find their length. Even worse, the scanning of the string reloads the cache with the string contents, which is pretty bad for performance.

        Of course, you could use `struct String { char *p; size_t length; };` but since every library you want to connect to uses 0 terminated strings, you're out on your island all alone, so pragmatically it does not work.

        Another speed-destroying problem with C strings is you cannot take a substring that does not require allocating a new string and then copying the data. (Unless the substring is right-justified.) This is not fast in any universe.

        D uses length-denoted strings as a basic data type, and with string processing code, it is much faster than C. Substrings are quick and easy. You can still interface with C because D string literals implicitly convert to C string literals, as the literals are 0 terminated. So this works in D:

            printf("hello world!\n");
        
        (People sometimes rag on me for still using printf, but printf is the most optimized and debugged library function in the world, so I take advantage!)
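
A minimal C sketch of the length-denoted alternative (hypothetical names; D bakes this in as a basic type): the length travels with the pointer, so there is no scanning for NUL, and taking a substring is O(1) pointer arithmetic with no allocation or copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct str { const char *p; size_t len; };

struct str str_from_c(const char *s) {      /* one strlen at the boundary */
    return (struct str){ s, strlen(s) };
}

/* Substring [start, start+len): just pointer arithmetic, no copy. */
struct str str_slice(struct str s, size_t start, size_t len) {
    return (struct str){ s.p + start, len };
}

int str_eq_c(struct str s, const char *c) { /* compare against a C literal */
    return s.len == strlen(c) && memcmp(s.p, c, s.len) == 0;
}
```

With `struct str hello = str_from_c("hello world");`, both `str_slice(hello, 0, 5)` and `str_slice(hello, 6, 5)` are views into the same buffer; the "out on your island" problem is that every C API you call still wants the NUL-terminated form.
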
        • pizlonator a day ago

          Yeah totally.

          It's a perfect example of C being optimized for simple mapping onto linear memory, rather than some kind of performance optimum.

          • Dylan16807 9 hours ago

            I don't know, a pointer with a size is also quite simple.

    • pizlonator a day ago

      > There are and have been many techniques and projects for making C more memory-safe.

      Sort of. None of them got all the way to safety, or they never got all the way to compatibility with C.

      Fil-C is novel in that it achieves both safety and compatibility.

      > The crucial question it always comes down to is what performance hit do you take using them?

      Is that really the crucial question?

      I don't think you would have even gotten to asking that question with most attempts to make C memory safe, because they involved experimental academic compilers that could only compile a subset of the language and only worked for a tiny corpus of benchmarks.

      Lots of C/C++ code is not written with a perf mindset. Most of the UNIX utilities are like that, for example.

      > That's why C has been on top for so long. Seat-of-the-pants hand-crafted C has always been the fastest high-level language.

      I don't think that's the reason. C rose to where it is today even when it was much slower than assembly. C was slower than FORTRAN for a long time (maybe still is?) but people preferred C over FORTRAN anyway.

      C's biggest superpower is how easy it makes it to talk to system ABI (syscalls, dynamic linking, etc).

      • blacksqr a day ago

        >> There are and have been many techniques and projects for making C more memory-safe.

        > Sort of.

        Yes. That's why I used the qualifier "more." Our statements are not in conflict.

        > Fil-C is novel in that it achieves both safety and compatibility.

        How does it affect performance?

        >> The crucial question it always comes down to is what performance hit do you take using them?

        > Is that really the crucial question?

        Yes, because it's the factor that industry leaders use to decide on which language to use. For example, Apple switching from Pascal to C way back in the Stone Age. The fact that it's the crucial question doesn't mean that lots of people don't consider other factors for their own reasons.

        > I don't think you would have even gotten to asking that question with most attempts to make C memory safe.

        Yes, most. But for example, Microsoft's Checked C comes with a performance penalty of almost 10% for a partial solution. Not academic. Very commercial.

        > C rose to where it is today even when it was much slower than assembly

        Yes, that's why I said "high-level language." I don't consider assembly high-level, do you?

        > people preferred C over FORTRAN anyway

        People preferred C in the 1970s/80s because at the time you could allocate memory dynamically in C but not in FORTRAN. FORTRAN fixed that in the 1990s, but by then there were too few FORTRAN programmers to compete. Since then C has serially defeated all newcomers. Maybe Go or Rust are poised to take it on. When a major operating system switches from C, we'll know.

        • pizlonator a day ago

          > How does it affect performance?

          Right now, 1.5x-5x, but considering how many optimizations I know I can do but haven't done yet, I think those numbers are an upper bound.

          > Yes, because it's the factor that industry leaders use to decide on which language to use. For example, Apple switching from Pascal to C way back in the Stone Age. The fact that it's the crucial question doesn't mean that lots of people don't consider other factors for their own reasons.

          I don't think this is true at all, sorry. Top reason for using C/C++ is inertia. If you've got a pile of C code, then you'll keep writing C.

          > Yes, most. But for example, Microsoft's Checked C comes with a performance penalty of almost 10% for a partial solution.

          Checked C didn't make C memory safe, so I don't think it's interesting.

          > Yes, that's why I said "high-level language." I don't consider assembly high-level, do you?

          No, I don't consider assembly to be high-level. The point is: serious engineers don't just blindly reach for the fastest programming language. They'll take slow downs if it makes them more productive. Happens all the time.

          > People preferred C in the 1970s/80s because at the time you could allocate memory dynamically in C but not in FORTRAN. FORTRAN fixed that in the 1990s, but by then there were too few FORTRAN programmers to compete. Since then C has serially defeated all newcomers. Maybe Go or Rust are poised to take it on. When a major operating system switches from C, we'll know.

          The last time I saw a benchmark where FORTRAN beat C was in the early 2000s. FORTRAN is much easier to optimize and compile.

          C is great for writing operating systems because C has the right abstractions, such as the abstractions necessary for doing dynamic linking and basically any kind of ABI compatibility. Rust and Go don't have that today. Even C++ is worse than C in this regard. Swift has a stable ABI, but it took heroic efforts to get there.

          C didn't eat the world because of its stellar performance. It's the other way around. C has stellar performance because it ate the world, and then the industry had no choice but to make it fast.

  • PaulDavisThe1st a day ago

    > There are clearly stated guarantees, like "a pointer cannot access outside the bounds of its allocation"

    But that's not a pointer in anything like the sense of a C pointer.

    You'd need to reword that (as I know you've been doing with FiL-C) to be something more like: no reference to a (variable|allocation|object) may ever be used to access memory that is not a part of the object.

    Pointers are not that, and the work you've done in FiL-C to make them closer to that makes them also be "not pointers" in a classic sense.

    I'm OK with that, it just needs to be more clear.

    • pizlonator a day ago

      Semantics.

      You can call Fil-C’s pointers whatever you like. You can call them capabilities if that works better for you.

      The point of my post is to enumerate the set of things you’d need to do to pointers to make them safe. If that then means we’ve created something that you wouldn’t call a pointer then like whatever.

      • tredre3 a day ago

        > Semantics.

        If your goal is just to redefine the word then by all means, continue.

        But semantics are very important if your goal is to drive adoption of your ideas. You can't misuse a term and then get pissy when people don't understand you.

        • pizlonator a day ago

          Seems like bro understood me just fine.

          In Fil-C, pointers are called “pointers”.

      • PaulDavisThe1st a day ago

        And my point is that you cannot make C pointers safe. You can make something else that is safe, and you're clearly hard at work on that, which is great.

        • pizlonator a day ago

          Fil-C's pointers work more like C pointers than like any other language construct I can think of, and are compatible enough that lots of C/C++ code compiles and runs with no changes.

          So I think that Fil-C pointers are just pointers.

          You could even get pedantic over what the spec says. If you go there, you find that Fil-C's pointers work exactly like how the spec promises pointers to work (and all of the places that the spec doesn't define either have safe semantics in Fil-C or lead to Fil-C safety errors).

        • pjc50 a day ago

          What behavior in the C standard requires unsafety from pointers??

          • pizlonator a day ago

            Not a big fan of this line of thinking since if you tried to deploy a C implementation that just implements what's in the C standard, then you'd quickly find that real world C code expects more of pointers than the C standard promises.

            Fil-C supports a superset of the C standard but a subset of what contemporary mainstream C compilers support (you can't pass an integer around in memory that is really being used to represent a pointer in Fil-C, but you can in Yolo-C).

            • layer8 a day ago

              Does Fil-C support uintptr_t? Because I’ve written programs that I believe to be quite strictly conforming C99 programs, that make use of uintptr_t.

              • pizlonator a day ago

                Yes you can use uintptr_t.

                You just can’t cast a pointer to uintptr_t, then store that int into memory, then load it back, then cast it back to pointer, and then dereference that pointer.

                • layer8 a day ago

                  Okay, but the C standard does allow exactly that for void*.

                  • pizlonator a day ago

                    You can do exactly that with `void*` in Fil-C.

                    I just don’t let integer types carry capabilities. If I did, things would get weird (like if you said `x + y` and both happened to carry capabilities then what would you get?)

beardyw a day ago

I remember chasing down a memory leak in my first commercial C code. Took me a long while to discover that if you allocate zero bytes you still have to free it! After that I took nothing for granted.

  • weinzierl a day ago

    It's not even guaranteed that malloc(0) doesn't allocate, so it could even cause an out-of-memory failure.

    • ignoramous a day ago

      > malloc(0) could cause an out of memory

      tbh, 640K RAM ought to be enough for anybody.

      • weinzierl a day ago

        For the last drop to make the cup run over it doesn't matter how big the cup is.

cryptonector a day ago

Nah. I use C a lot, but none of this is enough to make C safe. You really need the language and the tools to enforce discipline. Oh, and things like the cleanup attribute are not standard C either, so this is not portable code.

  • imtringued 12 hours ago

    I didn't see anything new either.

    What I would expect from C developers is this:

    * Run all CI with UBSAN. Create versions of popular distributions that build every single package with UBSAN just to catch the bugs.

    * Use design-by-contract patterns for pre- and postconditions (library-induced UB is common in C++)

    * Use model checking software like CBMC to statically guarantee the absence of UB and validity of the contracts

    * Build a fuzzer for every method that cannot be formally verified

    This is the bare minimum needed to keep C/C++ safe. The same applies to unsafe Rust by the way.
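    A minimal sketch of the pre/postcondition pattern from the list above (the macro names are illustrative, not from any particular library), built in CI with something like `cc -fsanitize=undefined`:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative contract macros; real projects often make these
   toggleable or route failures to a logging hook. */
#define REQUIRES(cond) assert((cond) && "precondition violated")
#define ENSURES(cond)  assert((cond) && "postcondition violated")

/* Sum the first n elements. The precondition catches the NULL-with-
   nonzero-length misuse that would otherwise be library-induced UB. */
static long sum(const int *a, size_t n) {
    REQUIRES(a != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    ENSURES(n > 0 || total == 0);
    return total;
}
```

Tools like CBMC can then check the contracts statically, and anything they can't cover goes to the fuzzer.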

    • cryptonector 7 hours ago

      I'd expect something like a super-C that provides a counted-byte string (still NUL-terminated for interoperability) type and support functions, `defer` or similar, `with`-like macros, etc., `mutable`/`immutable`. Such a thing could be like C++ was in the beginning: a front-end that translates to C99 or whatever standard is your lowest common denominator. You'd still have to do manual memory management, so you'd still have use-after-free issues, but they'd be a lot less common. Similarly you'd still have races, but a lot fewer.
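      A rough sketch of the counted-byte string type described above (the names are hypothetical; the point is that the length travels with the data while the buffer stays NUL-terminated for interop with existing C APIs):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: length is stored explicitly, so no
   strlen() walks and no silent truncation, but data[len] == '\0'
   so it can still be handed to plain C functions. */
typedef struct {
    size_t len;
    char  *data;
} cstr;

static cstr cstr_from(const char *s) {
    cstr r;
    r.len = strlen(s);
    r.data = malloc(r.len + 1);
    memcpy(r.data, s, r.len + 1);  /* copies the terminator too */
    return r;
}

static void cstr_free(cstr *s) {
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

A super-C front end could generate exactly this kind of code while giving it first-class syntax, the same way early C++ translated to C.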

  • throwawaymaths a day ago

    Usually portability in C comes with the proviso that you can drop in whatever #includes you want?

    • cryptonector a day ago

      No, it's really not that simple at all.

      • throwawaymaths a day ago

        Probably depends on the macro, but ok.

        • pjmlp 16 hours ago

          It starts with discovering how little most folks know of the ISO C legalese versus what their compiler actually does, and it goes from there once you add anything not part of the standard library.

debatem1 a day ago

I don't think anyone ever doubted that a C program could be memory safe. The problem is knowing without exhaustive work whether yours is one of them.

These aren't bad practices, but I don't think they satisfy that desire either.

SV_BubbleTime a day ago

I am in no way at all better than that guy. Not even sort of. I appreciate his talk.

However, if I were to make a presentation based on my superior C practices, it would have to be implementation and example heavy.

All of his rules sound great, except for when you have to break them or you don’t know how to do the things he’s talking about in your code, because you need to get something done today.

It reads a little like “I’ve learned a lot of lessons over my career, you should learn my lessons. You’re welcome.”

  • rrrix1 a day ago

    The talk was obviously extremely time-limited, as demonstrated when they basically skipped the last handful of slides and then it abruptly ended. I think for the time allocated, it was just right, and they did include a couple of examples where it made sense.