Show HN: I wrote a minimal memory allocator in C

(github.com)

83 points | by t9nzin 9 hours ago

8 comments

canyp 3 hours ago
I always like me some memory allocator blog/code. Two links in the context of gamedev below, in case you or anyone else is interested.
https://screwjankgames.github.io/engine%20programming/2020/0...
https://www.bytesbeneath.com/p/the-arena-custom-memory-alloc...
I also don't know how much we want to butcher this blog post, but:
> RAM is fundamentally a giant array of bytes, where each byte has a unique address. However, CPUs don’t fetch data one byte at a time. They read and write memory in fixed-size chunks called words which are typically 4 bytes on 32-bit systems or 8 bytes on 64-bit systems.
CPUs these days fetch entire cache lines. Memory is also split into banks. There are many more details involved, and it is viewing memory as a giant array of bytes that is fundamentally broken. It's a useful abstraction up until some point, but it breaks apart once you analyze performance. This part of the blog didn't seem very accurate.
achierius 7 hours ago
Looks nice! Though I have to say, you should probably avoid sbrk even for small allocations. Obviously it's slow, but beyond that it's essentially a global singleton, which introduces a lot of subtle failure cases you might not think of and can't really solve anyway. It's better to mmap a chunk of memory and sub-allocate it yourself.
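For what it's worth, here is a minimal sketch of the mmap-and-sub-allocate idea, assuming Linux or another POSIX system that supports MAP_ANONYMOUS. The names arena_t, arena_init, arena_alloc, and arena_destroy are made up for illustration and are not from the linked repo. It only bumps an offset and never frees individual blocks:

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical arena: one mmap'd chunk that we carve up ourselves. */
typedef struct {
    unsigned char *base;  /* start of the mmap'd chunk */
    size_t cap;           /* total bytes reserved from the OS */
    size_t used;          /* bytes handed out so far */
} arena_t;

/* Reserve one private anonymous chunk. Returns 0 on success, -1 on failure. */
int arena_init(arena_t *a, size_t cap) {
    void *p = mmap(NULL, cap, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    a->base = p;
    a->cap = cap;
    a->used = 0;
    return 0;
}

/* Hand out n bytes, rounded up to a multiple of 16. The chunk itself is
   page-aligned, so every returned pointer is 16-byte aligned. */
void *arena_alloc(arena_t *a, size_t n) {
    assert(a != NULL && a->base != NULL);
    size_t need = (n + 15) & ~(size_t)15;
    if (need < n || need > a->cap - a->used) return NULL;  /* overflow or full */
    void *p = a->base + a->used;
    a->used += need;
    return p;
}

/* Give the whole chunk back to the OS in one call. */
void arena_destroy(arena_t *a) {
    munmap(a->base, a->cap);
    a->base = NULL;
    a->cap = 0;
    a->used = 0;
}
```

Usage would be something like `arena_t a; arena_init(&a, 1 << 20); void *p = arena_alloc(&a, 64);` followed by a single `arena_destroy(&a)` at the end. A real allocator would layer free lists or size classes on top of this.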
- macintux 6 hours ago
  Can you supply an example of a failure case that can’t be solved (or is at least challenging to solve)?
  - sweetjuly 5 hours ago
    sbrk grows linearly, and if anything is mapped in the way it fails. mmap can map anywhere there's space as it is not restricted to linear mappings. So, you'd better hope a mapping doesn't randomly land there and run you out of space.
    It's not a failure, but relatedly: because sbrk is linear, you also don't really have a reasonable way to deal with fragmentation. For example, suppose you allocate 1000 page-sized objects and then free all but the last one. With an mmap based heap, you can free all 999 other pages back to the OS whereas with sbrk you're stuck with those 999 pages you don't need for the lifetime of that 1000th object (better hope it's not long lived!).
    Really, sbrk only exists for legacy reasons.
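    A rough sketch of the 1000-page case, assuming Linux or another POSIX system with MAP_ANONYMOUS. The helper name keep_last_page is hypothetical, and for brevity it uses one contiguous mapping rather than a full heap:

    ```c
    #define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
    #include <stddef.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Map n pages, then return every page except the last to the OS.
       Returns a pointer to the surviving page (NULL on failure) and
       stores the page size in *page_out. */
    void *keep_last_page(size_t n, size_t *page_out) {
        if (n == 0) return NULL;
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        unsigned char *base = mmap(NULL, n * page, PROT_READ | PROT_WRITE,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED) return NULL;
        /* Unmap the first n-1 pages; the last page stays mapped and usable. */
        if (n > 1 && munmap(base, (n - 1) * page) != 0) {
            munmap(base, n * page);
            return NULL;
        }
        *page_out = page;
        return base + (n - 1) * page;
    }
    ```

    With sbrk the equivalent is not possible: the break can only move down past memory at the very top of the heap, so the live object at the end pins everything below it.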
    - ori_b 4 hours ago
      > With an mmap based heap, you can free all 999 other pages back to the OS whereas with sbrk you're stuck with those 999 pages you don't need for the lifetime of that 1000th object (better hope it's not long lived!).
      Thanks to the wonders of virtual memory, you can madvise(MADV_DONTNEED), and return the memory to the OS, without giving up the address space.
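      A small sketch of what that looks like, assuming Linux and a private anonymous mapping (other systems may treat MADV_DONTNEED as a mere hint). The function name demo_release is made up for illustration:

      ```c
      #define _DEFAULT_SOURCE   /* exposes madvise() and MAP_ANONYMOUS on glibc */
      #include <stddef.h>
      #include <string.h>
      #include <sys/mman.h>
      #include <unistd.h>

      /* Dirty one page, release its physical backing, then read it again.
         Returns the first byte read back, or -1 on error. */
      int demo_release(void) {
          size_t page = (size_t)sysconf(_SC_PAGESIZE);
          unsigned char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          if (p == MAP_FAILED) return -1;
          memset(p, 0xAB, page);               /* force the page to be backed */
          if (madvise(p, page, MADV_DONTNEED) != 0) {
              munmap(p, page);
              return -1;
          }
          /* The address range is still mapped, so this read does not fault.
             On Linux a private anonymous page comes back zero-filled. */
          int v = p[0];
          munmap(p, page);
          return v;
      }
      ```

      On Linux this returns 0: the old contents are gone and the kernel supplies a fresh zero page on the next touch.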
      - squirrellous 4 hours ago
        Not giving up the address space feels like an anti-feature. This would mean, among other things, that access to the DONTNEED memory is no longer a segfault but garbage values instead, which is not ideal.
AdieuToLogic 5 hours ago
Why redeclare the function signatures in allocator.h[0] when they must match what is already defined by the C standard?
Since this is all allocator.h[0] contains aside from other include statements, why have allocator.h at all?
0 - https://github.com/t9nzin/memory/blob/main/include/allocator...
- leecommamichael 3 hours ago
  Why write a mini allocator?
checker659 5 hours ago
That project structure is reminding me of Claude.
- gameman144 3 hours ago
  Could you elaborate? The project structure looks extremely normal to me, but I don't know if I'm overlooking red flags all over the place.
  - checker659 2 hours ago
    The structure in the README.md (not the actual structure).
- leecommamichael 3 hours ago
  Personally I’d not bother with folders, but to each their own. I’m sorry but I just don’t see what you’re onto.
- keyle 5 hours ago
  So does half the readme
  - leecommamichael 3 hours ago
    Which part?
rurban 2 hours ago
One line: bump sbrk(). Done.
No need to free in short-lived processes.
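Roughly like this, as a sketch assuming Linux/glibc, where sbrk() is still available. The name bump_alloc is made up, and mixing it with the system malloc in the same process is not something this accounts for:

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() on glibc */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical bump allocator: move the program break up and hand back
   the old break. Nothing is ever freed; the OS reclaims it all at exit. */
void *bump_alloc(size_t n) {
    /* Round up to a multiple of 16 so successive blocks keep whatever
       alignment the initial break had (16-byte alignment of that initial
       break is an assumption, not a guarantee). */
    size_t need = (n + 15) & ~(size_t)15;
    if (need < n || need > (size_t)INTPTR_MAX) return NULL;
    void *p = sbrk((intptr_t)need);   /* returns the previous break */
    return p == (void *)-1 ? NULL : p;
}
```

There is deliberately no matching free function.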
- matheusmoreira 1 minute ago
  That's also the fastest garbage collection strategy. Just keep allocating new objects and never collect and free the old ones!
  Perfectly usable in many applications. Unfortunately, since it depends on your application, it's ill-suited for a general purpose library.
quibono 7 hours ago
I hate that very often my first reaction to Show HN posts like this is to cynically look for signs of blatant AI code use.
I don't think that's the case here though.
Subsentient 4 hours ago
As soon as I saw mmap(), I knew this wasn't a true native allocator. So yeah, not quite so insightful after all.
- matheusmoreira 3 minutes ago
  How so? All production memory allocators use mmap nowadays. What were you expecting?
- leecommamichael 3 hours ago
  Your comment reads as if you believe that simply writing the syscall C wrapper yourself would constitute a meaningful enhancement to the code. We can all apply more effort to be insightful.
- vintagedave 4 hours ago
  It’s a ‘minimal’ allocator. From reading the blog, it seems to go in depth into how allocator principles work in practice, things like coalescing blocks.
  I haven’t read it in full, so I'm not sure if it discusses using blocks vs. other structures (e.g. stack-based allocators, stack being the data structure, not the program stack). I.e., it’s a set of implementation choices. It still seems to reflect common ways of allocating in far more detail than many blogs I’ve read on the topic do.