cbdevidal's comments | Hacker News

Newer Kindle books published in the last few years require more than the Calibre plugin. Amazon is tightening the loop.

https://www.reddit.com/r/Calibre/comments/1q1uza4/successful...


I’ve heard a better idea.

“What you should in fact do is employ all the world's top male and female supermodels, pay them to walk the length of the train handing out free Chateau Petrus for the entire duration of the journey. You'll still have about 3 billion pounds left in change and people will ask for the trains to be slowed down.” ~Rory Sutherland


I’m a bit of an optimist. I think this will smack the hands of developers who don’t manage RAM well and future apps will necessarily be more memory-efficient.

> I think this will smack the hands of developers who don’t manage RAM well

And hopefully kill Electron.

I have never seen the point of spinning up a 300+MB app just to display something that ought to need only 500KB to paint onto the screen.


The point is being able to write it once with web developers instead of writing it a minimum of twice (Windows and macOS) with much harder to hire native UI developers.

And HTML/CSS/JS are far more powerful for designing than any of SwiftUI/IB on Apple, Jetpack/XML on Android, or WPF/WinUI on Windows, leaving aside that this is what designers, design platforms and AI models already work best with. Even if all the major OSes converged on one solution, it still wouldn't compete on ergonomics or declarative power for designing.

Lol SwiftUI/Jetpack/WPF aren’t design tools, they’re for writing native UI code. They’re simply not the right tool for building mockups.

I don’t see how design workflows matter in the conversation about cross-platform vs native and RAM efficiency since designers can always write their mockups in HTML/CSS/JS in isolation whenever they like and with any tool of their choice. You could even use purely GUI-based approaches like Figma or Sketch or any photo/vector editor, just tapping buttons and not writing a single line of web frontend code.


Who said anything about mockups? Design goes all the way from concept to real-world. If a designer can specify declaratively how that will look, feel, and animate, that's far better than a developer taking a mockup and trying their hardest to approximate some storyboards. Even as a developer working against mockups, I can move much faster with HTML/CSS than I can with native, and I'm well experienced at both (yes, that includes every tech I mentioned). With native, I either have to compromise on the vision, or I have to spend a long time fighting the system to make it happen (...and even then)

well, then you are really bad at native and should not be comparing those technologies despite your claims otherwise (which make little sense).

> really bad at native

Yikes. I spent 15 years developing native on both mobile and desktop. If you think that native has the same design flexibility as HTML/CSS, you're objectively wrong.

By design, each operating system limits you to its particular design language, and the styling of components is hidden behind the API, making forward-compatible customisation impossible. There's no escaping that. And if you acknowledge that fact, you can't then claim native has the same design flexibility as HTML/CSS. If you don't acknowledge that fact, you're unhinged from reality.

There's pros and cons to the two approaches, of course. But that's not what's being debated here.


The real disconnect is that the user doesn't really care all that much. It's mostly the designers who care. And Qt, for example, but also WPF, let you style components almost to the point of being unrecognizable and unusable. So if everyone will need to make do with 8GB for the foreseeable future, designers might just be told "No.", which admittedly will be a big shock to some of them. Or maybe someone finally figures out how to do HTML+CSS in a couple of megabytes.

> the user doesn't really care all that much

They do. But not in the way that you think.

I recently switched from Spotify (a well-known Electron-based app) to Apple Music (a well-known native app). The move was mostly an ethical one, but I must say, the UI functionality and app features are basically impoverished in comparison. One tiny example: navigating from a playlist entry to the artist requires multiple interactions. This is just one of many frustrations I've had with the app. But hey, it has beautiful liquid glass effects!

In short: iteration time matters. The time from design to implementation, to internal review, to real user feedback, and back to design from each phase should be as fast as possible. You don't get the same velocity in native as you do with the web stack. On top of that, you have to design and implement in quadruplicate: an iOS design for iOS, an Android one for Android, a macOS one for macOS, a Windows one for Windows. All of that is why people use Electron.


There is native to the OS and there's native to the machine.

Anyway, in both cases you don't really have to write it twice.

Native to the OS: write only the UI twice, but implement the Core in Rust.

Native to the machine: write it only once, e.g. in iced, and compile it for every platform.


You mean the point is to dump it all on the end user's machine, hogging its resources.

It's bad enough having to run one bloated browser, now we have to run multiples?

This is not the right path.


As the kids say: skill issue!

The point is you can be lazy and write the app in HTML and JS. Then you don't need to write C, even though C syntax is similar to JS syntax and most GUI apps won't require advanced C features if the GUI framework is generous enough.

Now that everyone who can't be bothered vibe codes, and Electron apps are the overevangelized norm… people will probably not even worry about writing JS, and Electron will be here to stay. The only way out is to evangelize something else.

Like how half the websites have giant, in-your-face cookie banners and half have minimalist banners. The experience will still suck for the end user because the dev doesn't care and neither do the business leaders.


Syntax ain't the problem. The semantics of C and JS could not be more different.

But the point isn’t that they’re more different than alike. The point is that learning c is not really that hard it’s just that corporations don’t want you building apps with a stack they don’t control.

If a JS dev really wanted to, it wouldn't be a huge uphill climb to code a C app, because the syntax and concepts are similar enough.


Honestly C and JavaScript could hardly be more different, as languages.

About the only thing they share is curly braces.


Yeah, JS is closer to Lisp/Scheme than to C (I say this as someone who writes JS, Clojure and the occasional C).

What "advanced features" are there to speak of in C? What does the syntax of C being similar to JS matter?

This comment makes no sense.


Well, there's the whole C89 vs C99 thing. I'll let you figure the rest out, since it's a puzzle from your perspective.

It's happening. Cursor 3 moved to Rust. A lot of people are using Zed (Rust) instead of VSCode.

It won't be "happening" until Slack, Teams, and Discord leave Electron behind. They are the apps that need to be open 24/7.

It's not entirely clear what the connection is.

We're not doing Electron because some popular software is also using it. We're doing Electron because the ability to create truly cross-platform interfaces with the web stack is more important to us than 300 MB of user memory.


> web stack is more important to us than 300 MB of user memory.

May I never have to use or work on your project's software.


> We're doing Electron because the ability to create truly cross-platform interfaces with the web stack is more important to us than 300 MB of user memory.

It's closer to 1GB but trust me, everyone is well aware of your priorities.


"I would rather spend the user's money than my engineer's time"

Teams works similarly in a browser tab and "natively". Slack was similar, if I remember correctly.

You should check the memory use of that browser tab. You’re not saving much either way running in a browser or in Electron, which is effectively a browser.

I only ever use Discord in a browser window.

"cursor 3" is just a landing page. The editor is still the old vscode fork...

Are you sure about Cursor? I haven't seen anything about that, I think it's still based on VSCode/electron.

No. I saw they rebuilt it with some rust involved. It's no longer a vscode fork.

As if native apps are any better. The Books app on my Mac takes 400MB without even having a single book open.

I find that an exaggerated claim, have you really checked they aren't using a webview or some other non-native runtime?

I didn't, but that's exactly my point.

Native apps are so poorly optimized that they don't offer any advantage over Electron apps.


Won't happen. People are ok with swapping to their SSDs, Macbook Neo confirms that

Hopefully just kill off the javascript for everything mindset to be honest.

You do need a couple framebuffers, but for the most part yeah...

Who cares about 300MB? Where is that going to move the needle for you? And if the alternative is a memory-unsafe language, then 300MB is a price more than worth paying. Likewise if the alternative is the app never getting started, or being single-platform-only, because the available build systems suck too bad.

There ought to be a short one-liner that anyone can run to get easily installable "binaries" for their PyQt app for all major platforms. But there isn't, you have to dig up some blog post with 3 config files and a 10 argument incantation and follow it (and every blog post has a different one) when you just wanted to spend 10 minutes writing some code to solve your problem (which is how every good program gets started). So we're stuck with Electron.


There's a world of difference between using a memory safe language and shipping a web browser with your app. I'm pretty sure Avalonia, JavaFX, and Wails would all be much leaner than electron.

The people who hate Electron hate JavaFX just as much if not more, and I'm not sure it would even use less memory. And while the build experience isn't awful, it's still a significant amount of work to package up in "executable" form especially for a platform different from what you're building on, or was until a couple of years ago. And I'm pretty sure Avalonia is even worse.

> and I'm not sure it would even use less memory

It likely would use less, and doesn't use a browser for rendering.

> And I'm pretty sure Avalonia is even worse

Definitely not

> The people who hate Electron hate JavaFX just as much if not more

In my opinion, I only see this from people that seem to form all of their opinions on tech forums and think Java=Bad. These are the people that think .NET is still windows only and post FUD because they don't know how to just ask for help.


> And if the alternative is a memory-unsafe language

and if not?


> and if not?

If the alternative is memory-safe and easy to build, then maybe people will switch. But until it is it's irresponsible to even try to get them to do so.


Until? Just take what's out there - it's so easy to improve on Electron

Like what? Where else (that's a name brand platform and not, like, some obscure blog post's cobbled-together thing) can I start a project, push one button, and get binaries for all major platforms? Until you solve that people will keep using Electron.

There are quite a few options. Many of them look dated though. I think that's the USP of Electron.

Hope dies last, as they say.

Then again, after many, many years of claims that the following year would be the year of the Linux Desktop, there seems to be more and more of a push into that direction. Or at least into a significant increase in market share. We can thank a current head of state for that.


Like the 1973 oil crisis? The end of V8 engines (pun intended).

Yeah, like that. Modern engines eclipsed pre-1970s engines in performance, efficiency, and even (yes I said it) in reliability.

At the cost of simplicity and beauty. And two lost decades of mediocre performance. Sigh


The demand is being driven by inference though. I really don't think there will be much motivation.

The large models are incredibly inefficient. We'll be squeezing them down for generations.

Right, that's where the major push is right now. Not with shrinking down some code libraries.

oh that would be a dream

Using a lot less RAM often implies using more CPU, so even with inflated RAM prices, it's not a good tradeoff (at least not in general).

In practice, you generally see the opposite. The "CPU" is in fact limited by memory throughput. (The exception is intense number crunching or similar compute-heavy code, where thermal and power limits come into play. But much of that code can be shifted to the GPU.)

RAM throughput and RAM footprint are only weakly related. The throughput is governed by the cache locality of access patterns. A program with a 50MB footprint could put more pressure on the RAM bus than one with a 5GB footprint.
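
To make that concrete, here's a toy Java sketch (sizes and the random access pattern are made up purely for illustration): both loops walk the same ~512MB array, so the footprint is identical, but the random pass misses cache almost every time and keeps the memory bus far busier.

    import java.util.Random;

    // Same ~512MB footprint in both cases; only the access pattern differs.
    public class BandwidthDemo {
        public static void main(String[] args) {
            long[] data = new long[64 * 1024 * 1024];
            long sum = 0;

            // Sequential pass: prefetch-friendly, mostly cache hits,
            // modest pressure on the RAM bus per element.
            for (int i = 0; i < data.length; i++) sum += data[i];

            // Random pass over the very same array: nearly every access
            // misses cache, so the bus does far more work for the same footprint.
            Random rnd = new Random(42);
            for (int i = 0; i < data.length; i++) sum += data[rnd.nextInt(data.length)];

            System.out.println(sum); // keep the loops from being optimised away
        }
    }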

You're absolutely right? I don't really disagree with anything you're saying there, that's why I said "generally" and "in practice".

Reducing your RAM consumption is not the best approach to reducing your RAM throughput is my point. It could be effective in some specific situations, but I would definitely not say that those situations are more common than the other ones.

I don't understand how this connects to your original claim, which was about trading ram usage for CPU cycles. Could you elaborate?

From what I understand, increasing cache locality is orthogonal to how much RAM an app is using. It just lets the CPU get cache hits more often, so it only relates to throughput.

That might technically offload work to the CPU, but that's work the CPU is actually good at. We want to offload that.

In the case of Electron apps, they use a lot of RAM and that's not to spare the CPU.


> increasing cache locality is orthogonal to how much RAM an app is using. It just lets the CPU get cache hits more often, so it only relates to throughput.

Cache misses mean CPU stalls, which mean wasted CPU (i.e. the CPU accomplishes less than it could have in some amount of time).

> In the case of Electron apps, they use a lot of RAM and that's not to spare the CPU

The question isn't why apps use a lot of RAM, but what the effects of reducing it are. Reducing memory consumption by a little can be cheap, but if you want to do it by a lot, development and maintenance costs rise and/or CPU costs rise, and both are more expensive than RAM, even at inflated prices.

To get a sense for why you use more CPU when you want to reduce your RAM consumption by a lot: using much less RAM while still letting the program work with the same data means that you're reusing the same memory more frequently, and that takes computational work.
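
A toy sketch of that trade-off (class and method names are mine, purely illustrative): the low-RAM variant recomputes a derived table on every lookup, the low-CPU variant keeps it resident.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative only: trade RAM for CPU by caching a derived table,
    // or CPU for RAM by recomputing it on every request.
    public class SpaceTimeTradeoff {
        private final Map<Integer, long[]> cache = new HashMap<>();

        // Pretend this is expensive work that derives a table from a key.
        private long[] derive(int key) {
            long[] table = new long[1 << 16];
            for (int i = 0; i < table.length; i++) table[i] = (long) key * i;
            return table;
        }

        // Low-RAM variant: nothing stays resident, CPU is spent on every call.
        long lookupRecompute(int key, int index) {
            return derive(key)[index];
        }

        // Low-CPU variant: derived tables stay in memory after the first call.
        long lookupCached(int key, int index) {
            return cache.computeIfAbsent(key, this::derive)[index];
        }
    }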

But I agree that on consumer devices you tend to see software that uses a significant portion of RAM and a tiny portion of CPU and that's not a good balance, just as the opposite isn't. The reason is that CPU and RAM are related, and your machine is "spent" when one of them runs out. If a program consumes a lot of CPU, few other programs can run on the machine no matter how much free RAM it has, and if a program consumes a lot of RAM, few other programs can run no matter how much free CPU you have. So programs need to aim for some reasonable balance of the RAM and CPU they're using. Some are inefficient by using too little RAM (compared to the CPU they're using), and some are inefficient by using too little CPU (compared to the RAM they're using).


> Cache misses mean CPU stalls, which mean wasted CPU (i.e. the CPU accomplishes less than it could have in some amount of time).

Yeah, I was saying CPU cache hits would result in better performance. The creator of Zig has argued that the easiest way to improve cache locality is by having smaller working sets of memory to begin with. No, it's not a given this will always work in every case. You can reduce working memory and not have better cache locality. But in a general sense, I understand why he argues for it.

> So programs need to aim for some reasonable balance of the RAM and CPU they're using

I agree with this, but

> but if you want to do it by a lot, development and maintenance costs rise and/or CPU costs rise, and both are more expensive than RAM, even at inflated prices

I would like you to clarify further, because saying CPU costs are more expensive than RAM costs is a bit misleading. A CPU might literally cost more than RAM, but a CPU is remarkably faster, and for work done, much cheaper and more efficient, especially with cache hits.

You had originally said

> It could be effective in some specific situations, but I would definitely not say that those situations are more common than the other ones

This is what I'm confused on. Why do you think most cases wouldn't benefit from this? Almost every app I've used is way on one end of the spectrum with regards to memory consumption vs CPU cycles. Don't you think there are actually a lot of cases where we could reduce memory usage AND increase cache locality, fitting more data into cache lines, avoiding GC pressure, avoiding paging and allocations, and the software would 100% be faster?


> But in a general sense, I understand why he argues for it.

Andrew is not wrong, but he's talking about optimisations with relatively little impact compared to others and is addressing people who already write software that's otherwise optimised. More concretely, keeping data packed tighter and reducing RAM footprint are not the same. The former does help CPU utilisation but doesn't make as big of an impact on the latter as things that are detrimental to the CPU (such as switching from moving collectors to malloc/free).

> Why do you think most cases wouldn't benefit from this?

The context to which "this" was referring was "Reducing your RAM consumption is not the best approach to reducing your RAM throughput is my point." For data-packing, Andy Kelley style, to reduce the RAM bandwidth, the access patterns must be very regular, such as processing some large data structure in bulk (where prefetching helps). This is something you could see in batch applications (such as compilers), but not in most programs, which are interactive. If your data access patterns are random, packing it more tightly will not significantly reduce your RAM bandwidth.


> Andrew is not wrong... and is addressing people who already write software that's otherwise optimised

I'm getting lost. What are we talking about if not that? Because if you're talking about unoptimized software, you can absolutely reduce RAM consumption without putting extra load on the CPU. Using a language that doesn't box every single value is going to reduce RAM consumption AND be easier on the CPU. Which is what most people are talking about on this post.

> The context to which "this" is referring to was "Reducing your RAM consumption is not the best approach to reducing your RAM throughput is my point."

I'm more interested in the original claim, which was

> Using a lot less RAM often implies using more CPU

There are a lot of apps using a lot of RAM, and it's not to save CPU. So where is "often" coming from here? I think there are WAY more apps that could stand to be debloated and would use less CPU.

It feels like you're coming at this from a JVM perspective. Yeah, tweaking my JVM to use less RAM would result in more CPU usage. But I don't think there's a single app out there as optimized as the JVM is. They use more RAM for other reasons.

> If your data access patterns are random, packing it more tightly will not significantly reduce your RAM bandwidth

Packing helps random access too. A smaller working set means more of your random accesses land in cache. Prefetching is one benefit of packing, but cache and TLB pressure reduction is the bigger one, and it applies regardless of access pattern
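
Here's roughly what I have in mind, as a made-up Java sketch (the sizes and layout are illustrative, not a benchmark): the same million points stored either as a packed pair of primitive arrays or as an array of references to little heap objects. The packed version's working set is several times smaller, so more of the random lookups land in cache.

    import java.util.Random;

    // Two layouts for the same data; the packed one has a much smaller working set.
    public class PackingDemo {
        static final int N = 1_000_000;

        // Pointer-heavy layout: an array of references to small heap objects.
        static final class Point { int x, y; Point(int x, int y) { this.x = x; this.y = y; } }
        static final Point[] objects = new Point[N];

        // Packed layout: the same data as parallel primitive arrays.
        static final int[] xs = new int[N];
        static final int[] ys = new int[N];

        public static void main(String[] args) {
            Random rnd = new Random(1);
            for (int i = 0; i < N; i++) {
                int x = rnd.nextInt(), y = rnd.nextInt();
                objects[i] = new Point(x, y);
                xs[i] = x;
                ys[i] = y;
            }
            int[] order = rnd.ints(N, 0, N).toArray(); // random access pattern

            long a = 0, b = 0;
            for (int i : order) a += xs[i] + ys[i];                       // ~8 bytes touched per point
            for (int i : order) { Point p = objects[i]; b += p.x + p.y; } // reference + object header + fields
            System.out.println(a + " " + b);
        }
    }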


> Using a language that doesn't box every single value is going to reduce RAM consumption AND be easier on the CPU. Which is what most people are talking about on this post.

What popular language does that? I admit that rewriting the software in a different language could lead to better efficiencies on all fronts, but such massive work is hardly "an optimisation", and there are substantial costs involved.

But more importantly, I don't think it's right. Removing boxing can certainly have an impact on RAM footprint without an adverse effect on CPU, but I don't think it's a huge one. RAM footprint is dominated by what data is kept in memory and the language's memory management strategy (malloc/free vs non-moving tracing collectors vs moving collectors), and changing either one of these can very much have an adverse effect on CPU.

> There are a lot of apps using a lot of RAM, and it's not to save CPU. So where is "often" coming from here?

That the developers may not be conscious of the RAM/CPU tradeoff doesn't mean it's not there. Keeping less data in memory (and computing more of it on demand) can increase CPU utilisation as can switching from a language with a moving collector to one that relies on malloc/free.

> Packing helps random access too. A smaller working set means more of your random accesses land in cache.

Unless your entire live set fits in the cache, what matters much more is the temporal locality, not the size of the live set. If your cache size is 50MB, a program with a 1GB live set could have just as many or just as few cache misses as a program with a 100MB live set. In other words, you could reduce your live set by a factor of 10 and not see any improvement in your cache hit rate, and you can improve your cache hit rate without reducing your live set one iota.

For example, consider a server that caches some session data and evicts it after a while. Reducing the allowed session idle time can drastically reduce your live set, but it will barely have an effect on cache locality.

Tighter data layouts absolutely improve cache behaviour, but they don't have a huge effect on the footprint. Conversely, what data is stored in RAM and your memory management strategy have a large effect on footprint but they don't help your cache behaviour much. In other words, Andy Kelley's emphasis on layout is very important for program speed, but it's largely orthogonal to RAM footprint.


I don't really disagree with most of what you're saying. What I took issue with is that you made it sound like software is a trade-off between just RAM and CPU. What is clear is that it's a trade-off between RAM, CPU, and abstractions (safe memory access, dev experience, etc.). My feeling, and the feeling of most people, is that dev experience has been so heavily prioritized that we now have abstractions upon abstractions upon abstractions, and software that did the same thing 20 years ago was somehow leaner than the software we have today. The narrow claim "within a fixed design, reducing RAM often costs CPU," is true.

> What popular language does that?

Other than C, Rust, Go, Swift? C# can use value types, Java cannot. So famously that Project Valhalla has been highly anticipated for a long time. Obviously the JVM team thinks this is a gap and want to address it. That is enough in itself to make someone consider a different language.

> I admit that rewriting the software in a different language could lead to better efficiencies on all fronts, but such massive work is hardly "an optimisation", and there are substantial costs involved

That's a pivot to a totally different discussion, which is dev experience. We can say using a different language is not an optimization, I don't care to argue about that. But the fact is some languages have access to optimizations others do not. My dad has 8gb of RAM. I'm not going to install a JavaFX text editor on his computer and explain to him that "it's really quite good value for what the JVM has to do."

> Removing boxing can certainly have an impact on RAM footprint without an adverse effect on CPU, but I don't think it's a huge one

Removing boxing can improve layout, footprint, and CPU utilization simultaneously. That would lie outside the framework "You can't improve one without harming the other."

And it can be a huge effect. Saying it's always a big or small difference is like saying a stack of feathers can never be heavy. It depends on the use case. For a long-running server dominated by caches and session state, sure, although you're not hurting your performance to do it. For data heavy code? The difference between a HashMap<Long, Long> and an equivalent contiguous structure in C# is huge.
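
To make the HashMap<Long, Long> example concrete in Java terms (a rough sketch; it cheats by assuming dense keys, where a plain array suffices, and a primitive-keyed map library would be the general answer):

    import java.util.HashMap;
    import java.util.Map;

    // Rough illustration of the footprint gap; exact numbers depend on the JVM,
    // pointer compression, load factor, etc.
    public class BoxedVsFlat {
        public static void main(String[] args) {
            int n = 1_000_000;

            // Boxed, pointer-chasing layout: each mapping is a HashMap.Node plus
            // two Long objects, typically well over 50 bytes per entry.
            Map<Long, Long> boxed = new HashMap<>(n);
            for (long i = 0; i < n; i++) boxed.put(i, i * 2);

            // Contiguous layout for the same data: 8 bytes per value, back to back.
            long[] flat = new long[n];
            for (int i = 0; i < n; i++) flat[i] = (long) i * 2;

            System.out.println(boxed.get(123_456L) + " " + flat[123_456]);
        }
    }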

>> There are a lot of apps using a lot of RAM, and it's not to save CPU. So where is "often" coming from here?

> That the developers may not be conscious of the RAM/CPU tradeoff doesn't mean it's not there

I'm saying Electron uses a lot of RAM and it has nothing to do with offloading work from the CPU, and everything to do with taking the most brute force approach to cross app deployment that we possibly can. I'm not saying anything about the intentions of these developers.

> Unless your entire live set fits in the cache, what matters much more is the temporal locality, not the size of the live set. If your cache size is 50MB, a program with a 1GB live set could have just as many or just as few cache misses as a program with a 100MB live set. In other words, you could reduce your live set by a factor of 10 and not see any improvement in your cache hit rate, and you can improve your cache hit rate without reducing your live set one iota

That's all true. You are fitting more data into each cache line, but your access pattern can be random enough that it doesn't make a difference. It would technically reduce your RAM footprint, but as you say, not by much. I only brought this up as an example of something that could reduce RAM footprint without harming CPU utilization, not because it's a worthwhile optimization.

But one way to shrink the live set and improve cache behavior at the same time is to stop boxing everything.


Sorry this is long, but you successfully nerd-sniped me :)

> Other than C, Rust, Go, Swift? C# can use value types, Java cannot. So famously that Project Valhalla has been highly anticipated for a long time. Obviously the JVM team thinks this is a gap and want to address it. That is enough in itself to make someone consider a different language.

As someone working on the JVM, I can tell you we're very much interested in Valhalla and largely for cache-friendliness reasons, but Java certainly doesn't box every value today, and you are severely overstating the case. If you think you can save on both RAM and CPU by preferring a low-level language (or Go, which is slower almost across the board), you're just wrong. But I want to focus on the more important general point you made first.

> My feeling, and the feeling of most people, is that dev experience has been so heavily prioritized that we now have abstractions upon abstractions upon abstractions, and software that did the same thing 20 years ago was somehow leaner than the software we have today. The narrow claim "within a fixed design, reducing RAM often costs CPU," is true.

The problem here is that in some situations there's truth to what you're saying, but in others, it is just seriously wrong. I think the misconception comes precisely because "most people" these days don't have the long experience with low level programming that people in my generation of developers do, and you're not aware that many of these abstractions are performance optimisations that come from deep familiarity with the performance issues of low-level programming (I started out programming in C and X86 Assembly, and in the first long job of my career I was working on hard- and soft-realtime radar and air traffic control systems in C++).

Low-level languages aren't meant to be fast (and aren't particularly fast). They're meant to give you direct control over the use of hardware. When it comes to small software, this control does frequently translate to very good performance, but as programs get larger, it makes low-level languages slow. It is true that Java was intended to help developer productivity, but it's also meant to solve some of the intrinsic performance issues in low-level languages, which it does rather well. After all, our team has been made up of some of the world's biggest experts in optimising compilers and memory management, and removing some of C++'s overheads is very much a central goal.

So where do things go wrong for low-level languages? The core problem is that these languages split constructs into fast and slow variants, e.g. static vs dynamic dispatch and stack vs heap allocation. The programmer needs to choose between them. What happens is:

1. As programs grow larger and more complex, the drift is almost completely monotonic, in the direction of the more expensive, and more general, variants.

2. There is a big difference between "a fast program could hypothetically be written" and "your program will be fast". Getting good performance out of low-level languages requires not only experience, but a lot of effort. For example, you can write a small benchmark and see that malloc/free are pretty fast these days, but that's often true only for the benchmark, where objects tend to be of the same size, and their allocation and deallocation patterns are regular. Memory allocators degrade over time, and they're quite bad when patterns are irregular, which is what happens in real programs, especially large ones. There's also the question of meticulous care around correctness. When Rust first came out I was very excited to see a few important correctness issues solved without loss of control, but was then severely disappointed. Almost anything that is interesting from a performance perspective for us low-level programmers requires unsafe. Even a good hashmap requires unsafe. The performance cost of safety in Rust is higher than it is in Java, and non-experts end up writing slower programs (when they're not small at least).

Such performance issues have plagued low-level programming forever, and Java is reducing these overheads. The idea that high abstractions can improve performance was possibly first stated in Andrew Appel's paper, "Garbage Collection Can Be Faster than Stack Allocation" in the eighties, in which he wrote: "It is easy to believe that one must pay a price in efficiency for this ease in programming... But this is simply not true."

Instead of a static/dynamic dispatch split, Java offers only the general construct (dynamic), and the compiler can "see through" dynamic dispatch and inline it better than any low-level compiler ever could. You can say that surely there has to be some tradeoff, and there is, but not to peak performance. The tradeoff is that 1. you lose control and can't guarantee that the optimisation will be made, so you get good average performance but maybe not the best worst-case performance (which is why it's not hard to beat Java in small programs if you know what you're doing), 2. the compiler needs to collect profiles as the program runs, which results in a "warmup" period.
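
A sketch of the kind of call site I mean (not from any real codebase): area() is a dynamic interface call in the source, but if the hot loop only ever observes one implementation, HotSpot speculates on that type, inlines the body behind a cheap guard, and the "vtable" cost disappears.

    // Monomorphic in practice: only Circle ever reaches the call site, so the
    // JIT can devirtualise and inline area() after warmup.
    public class DevirtDemo {
        interface Shape { double area(); }

        static final class Circle implements Shape {
            final double r;
            Circle(double r) { this.r = r; }
            public double area() { return Math.PI * r * r; }
        }

        public static void main(String[] args) {
            Shape[] shapes = new Shape[1_000_000];
            for (int i = 0; i < shapes.length; i++) shapes[i] = new Circle(i % 10);

            double total = 0;
            for (Shape s : shapes) total += s.area(); // dynamic call in the source, inlined by the JIT
            System.out.println(total);
        }
    }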

(If, like me, you like Zig, you might have seen Kelley talk about the "vtable barrier" in low level languages; this doesn't exist in Java. You may also be interested in this talk, "How the JVM Optimizes Generic Code - A Deep Dive", by John Rose: https://youtu.be/J4O5h3xpIY8.)

As for memory, not only do moving collectors not degrade (or fragment) over time, they can use the RAM chip as a hardware accelerator. Unfortunately, when a program uses the GPU for acceleration it's considered clever, but when it uses the RAM chip for acceleration it's considered bloated, even though every CPU core these days comes with at least 1GB of RAM that you might as well use if you're using up the core, as that's effectively free.

The people who consider that bloated are mostly those who haven't struggled with low-level programming long enough or on software that's large enough (they're people who say, I wrote this lean and fast gizmo by myself in 5 months; 99% of value delivered by software is in software written by large teams and maintained over many years). When I was working on a sensor-fusion and air-traffic control software in the nineties, it wasn't "lean"; we just had no choice. We constantly had to sacrifice performance for correctness. Of course, once machines got better, we switched to Java for better performance. God could have written a faster program of that size in C++, but not a large team made up of people with different levels of experience. People who think C++ (or Rust) is particularly efficient are people who haven't written anything big and long-maintained with it.

In conclusion:

1. Sometimes layers of abstractions add performance overheads, and sometimes they remove them. It is not generally true that more abstraction/generality have a performance cost, especially when comparing different languages, although it is almost always true within one language (e.g. dynamic dispatch is never faster than static dispatch, and is often slower, in C++, but dynamic dispatch in Java can be faster than even static dispatch in C++, and the tradeoffs are elsewhere). If you didn't believe that, you'd be writing all your code in Assembly (which is what I did to get the fastest programs in the early nineties, but it's just not generally faster today thanks to good optimisation algorithms in compilers).

2. Low-level languages give you control, not speed. This control typically translates to better performance in small programs and to worse performance in large ones. This performance problem is intrinsic to low-level programming.

> Removing boxing can improve layout, footprint, and CPU utilization simultaneously. That would lie outside the framework "You can't improve one without harming the other."

First, the footprint won't reduce by much. E.g., in Java, boxing could cost you 10% of your footprint, but the RAM-assisted acceleration could be 80% of the footprint.

Second, yes good layouts help CPU utilisation, but today you can't get that without giving up on other things that harm performance. Dynamic dispatch and memory management in C++ and Rust are just too slow, and while Zig can be blazing fast, it's not easy to write large software in it without compromising performance any more than in any other low-level language. I hope that with Valhalla, Java will be the first language to let you enjoy everything at once, but it's not really an option today.

> I'm saying Electron uses a lot of RAM and it has nothing to do with offloading work from the CPU, and everything to do with taking the most brute force approach to cross app deployment that we possibly can.

That developers choose it because it's a "brute force approach to cross app deployment" doesn't necessarily mean that it doesn't also offload work from the CPU, but yes, Electron apps are probably very inefficient from some perspectives. But I think this is also overstated by people who are overly sensitive. When we say something is inefficient, it means that we spend on it more than we have to, but what we really mean is that we could spend that resource that we save on something else instead. On my M1 laptop, I comfortably run three Electron apps and two browsers simultaneously without much harming the speed at which I can, say, compile HotSpot, probably because SSDs are fast enough for virtual memory in interactive GUIs. I can't think of anything else I could use my laptop's resources for if the apps were leaner on RAM. Reducing the consumption of a resource that can't be meaningfully used for other work isn't real efficiency, and if it comes at the expense of anything useful, it's downright inefficient.


Well, I'm glad you were nerd sniped; I appreciate the response. I've learned a lot and it's a good resource for people. I know you're an expert here. Most of the conversation for me has been clarifying my confusion based on my model of programming. There are parts that are way out of my depth, but I'm trying to focus on what I do understand, and I'm still greatly confused by some of your claims.

I understand the JVM is not only very efficient, but the JIT gives it unique opportunities to optimize where a compiled language couldn't. You may not get those optimizations consistently, but you don't necessarily need to go into that level of minutia.

You also pointed out that these JIT characteristics can be easily gamed against Java in microbenchmarks, so it's not difficult to make Java look slower than it is in a complex application.

That being said, I am not understanding this narrative that low level projects, as they grow, always devolve into an inefficient dynamic soup. The Linux kernel is millions of lines and uses function pointers sparingly and deliberately. SQLite is huge, mature, and almost entirely static. High-frequency trading systems, embedded software, browser rendering engines, database storage layers. There are entire industries of large, long-lived, performance-critical codebases that do not "devolve" into dynamic dispatch.

If you're saying it's just hard to do that, and Java makes it easy to get close enough with its already dynamic model, then fine. But if you're saying this is an inherent problem as low level programs grow, I would like to understand why.

But you also said

> If you think you can save on both RAM and CPU by preferring a low-level language (or Go, which is slower almost across the board), you're just wrong

Really? Ignoring gamed benchmarks, I don't think it's controversial to say Rust consistently beats Java at the same tasks in RAM and CPU. Maybe that's not important to you because they are too small and you're talking about what happens to programs as they grow in complexity. So I'd like to hear more about why you wrote I'm wrong.

I mean Java and Go are pretty much neck and neck here, with Go using way less RAM - https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

> Second, yes good layouts help CPU utilisation, but today you can't get that without giving up on other things that harm performance

Like what? I'm not understanding. You seem to be implying that without boxing we'd be stuck with a lot of dynamic dispatch and fragmented memory, and I'm not seeing the connection.

I brought up unboxing because pointer chasing is expensive and trying to make a collection in Java that you can efficiently loop through can be a frustrating thing.

Does it not box?

> Electron apps are probably very inefficient from some perspectives. But I think this is also overstated by people who are overly sensitive

I also have an m1 laptop and can run things fine. But I'm probably not going to budge on that, because I am consistently exposed to people with low RAM systems, and they are forced to use stuff like Teams in their day to day. Yes, I understand it's cross platform and saves on dev time. Nobody likes using WinForms. But I think Electron has been a net negative on the ecosystem of apps for people with ok computers.


> That being said, I am not understanding this narrative that low level projects, as they grow, always devolve into an inefficient dynamic soup. The Linux kernel is millions of lines and uses function pointers sparingly and deliberately.

Low-level languages are designed for direct and complete control over hardware, and that is also the job of an OS kernel. Their level of abstraction is a perfect match. But the things at which low-level languages are slow - heap allocations and dynamic dispatch - are exactly the things that applications (not kernels) naturally gravitate towards needing over time.

Of course, it's possible to keep redesigning the architecture as the software evolves to avoid low-level languages' slow operations, but that costs a lot. This isn't some new discovery. The motivations for Java's bet on a JIT and moving collectors were a result of seeing what happened with C++: it was very easy to write nice-looking and fast programs. It was very hard and very costly to keep them that way over time.

> SQLite is huge, mature, and almost entirely static

SQLite is not only not huge but is quite small. ~150KLOC.

> I don't think it's controversial to say Rust consistently beats Java at the same tasks in RAM and CPU.

I don't know if it's controversial, but it's certainly very wrong.

Let's look at one of the most famous terrible benchmarks: The Computer Language Benchmarks Game (it's terrible not only because it compares different algorithms, but also because it has no benchmarks that are long-running, none with interesting memory management, and no concurrent benchmarks - the very things most programs today do): https://benchmarksgame-team.pages.debian.net/benchmarksgame/... In all but one, the C++ and Java results are mixed, i.e. some Java entries are faster than some C++ entries and vice-versa, and this is despite the benchmarks penalising JITs and being minuscule, which is where low-level languages shine. This goes to my point about the important difference of "some program can be very fast" vs. "your program will be fast". Low level languages and Java are on different sides of the tradeoff here: low-level languages focus on control, which often means "someone could write fast code", while Java focuses on compiler and runtime optimisations of high abstractions with the goal of making your code fast.

If we look at another famous benchmark, techempower, we see the same thing: Java, Rust, and C++ results are intermixed, despite the benchmarks being small and thus favouring low-level languages: https://www.techempower.com/benchmarks/#section=data-r23

Of course, there aren't cross-language application benchmarks, i.e. benchmarks that measure the performance developers really care about. All I can say is that a developer of one of the world's largest tech companies told us that his new team lead wanted to migrate some service from Java to Rust for the performance. What happened was that they experienced a large drop in performance, but to save face, they spent 6-12 months carefully optimising the Rust code, and in the end managed to match, though not exceed, Java's performance.

C++ and Rust are simply not particularly fast for applications, and Java is. It's possible to spend a lot of effort optimising them, but it's effort that needs to be spent continuously as the program evolves. That's exactly what led compilation and memory management experts to design the JVM the way they did in the first place: It's hard to make low-level code efficient for large applications.

> I mean Java and Go are pretty much neck and neck here, with Go using way less RAM

Go uses way less RAM because it uses an inefficient non-moving collector, which is why you see Go shops complaining constantly about the poor performance of Go's GC and why they try to avoid it (as Java developers used to do in the past). The speed is similar only because the benchmarks are not very interesting, but while, broadly speaking, C++, Java, and Rust are roughly at the same "performance level" (ignoring all the tradeoffs I mentioned before), Go is strictly in a lower class. While you have to get pretty large to see Java beating C++ and Rust, it's fairly easy to see Java leaving Go in the dust even on fairly small programs. The programs just need to be a little more interesting than those in the Benchmarks Game.

But I don't think Go is even playing the same game. Its goal wasn't to be a super-optimised language that takes advantage of progress in compilation and memory management technologies. It was meant to be good enough for some things while keeping a small and simple implementation. It's faster than Python and JS, and that's the goal. It's not really trying to compete with C++/Java on performance.

> Like what? I'm not understanding. You seem to be implying that without boxing we'd be stuck with a lot of dynamic dispatch and fragmented memory, and I'm not seein the connection.

I'm saying that the languages that give you good layout today happen to be languages that are bad at other things (like memory management, dynamic dispatch, and concurrent data structures). So if you win in one area you lose in another (but depending on the program, some of these areas may matter more than others).

> Does it not box?

In Java, values in an int/long/double/etc. array or fields in a class like `class A { int a, b; boolean c; String d; }` are just as boxed as they are in C++, which is to say they're not. Instances of the class will not be flattened into arrays or fields, which is exactly why we have Valhalla, but the problem is not that severe in big programs (which is why we haven't dropped everything to just do Valhalla). Also, remember that boxing has a cost in low-level languages beyond cache-locality - due to heap allocations - that doesn't exist (at least not as significantly) in Java. Boxing in Java is much cheaper than it is in C++/Rust, except for the cache locality cost, but while in some programs that can be a problem, in many that's not the main one.

> I also have an m1 laptop and can run things fine. But I'm probably not going to budge on that, because I am consistently exposed to people with low RAM systems, and they are forced to use stuff like Teams in their day to day

Of course if you deploy a program that uses a lot of resource X to machines where X is more restricted than the other resources the program uses, you should optimise the consumption of X.

> But I think Electron has been a net negative on the ecosystem of apps for people with ok computers.

That depends on what else these people want to use their computers for while running an Electron app. By far the largest group of people I've seen complain are people here on HN who like counting MBs rather than look at the overall utilisation picture.


> The Computer Language Benchmarks Game (it's terrible not only because it compares different algorithms, but also because it has no benchmarks that are long-running, none with interesting memory management, and no concurrent benchmarks - the very things most programs today do)

It also compares un-optimised single-thread #8 programs transliterated line-by-line from the same original.

However long (programs run) they never seem to become "long-running".

There's always some programmer who replaces "interesting memory management" with array and int. (Many complaints about Go binary-trees programs seemed to be: they should implement a custom arena.)

What does "no concurrent benchmarks" mean when:

    import java.util.concurrent.CyclicBarrier; 

> Of course, there aren't cross-language application benchmarks

Maybe something like

https://link.springer.com/article/10.1186/s12859-019-2903-5


> However long (programs run) they never seem to become "long-running".

Most application servers are expected to run without issue for at least a day. Our acceptance tests run high workloads for 1, 7, and 30 days. The longest running Benchmarks Game benchmark doesn't break one minute. You can maybe argue whether long running is 3 hours or 3 days, but under one minute isn't long running by anyone's definition.

> What does "no concurrent benchmarks" mean when: import java.util.concurrent.CyclicBarrier;

I believe it's used to coordinate parallelism. Parallelism (where tasks cooperate) and concurrency (where they compete) result in completely different machine workloads.

> Maybe something like https://link.springer.com/article/10.1186/s12859-019-2903-5

It's obviously more interesting than the benchmarks game as it exercises things in a more realistic way, but as much as I like seeing Java winning as it did in this benchmark [1] (even an ancient version of Java, before the new GC generations and new compiler optimisations) it's still very small, and as a batch program, not very representative of most software people write.

The problem with benchmarks is that they tell you how fast a specific program is (the benchmark itself) but it's very hard to generalise from that result to what you're interested in, unless the benchmark is very similar to your program (microbenchmarks never are; larger benchmarks could be, but the space is large so you need to be lucky).

[1]: It's interesting that they made a common mistake when interpreting the results. The program seems to try to get the CPU to 100%. In this situation it's not hard to see that a program that runs even 1% faster and uses 10x more memory is more memory efficient than a program that's 1% slower and uses 10x less memory. That's because while a program runs at 100% CPU, no RAM can be used for any purpose by any other program. So either way you capture 100% of RAM, but in one case you capture it for less time. This idea is at the core of using RAM chips as hardware accelerators (using up CPU effectively uses up RAM because using RAM requires CPU cycles).


> … whether long running is 3 hours or 3 days…

JavaOne long ago, there would be mixed messages: both "So a benchmark that ends in less than 10 sec probably does not measure anything interesting." and in blog post benchmarks "100000000 hashes in 5.745 secs … 100000000 primes in 1.548 secs"

(Goldilocks would know.)

> … different machine workloads…

I'm happy to accept that you didn't mean no parallel programs.

> … very hard to generalise …

Indeed.

https://www.larcenists.org/Twobit/bmcrock.temp.html


> Low-level languages are designed for direct and complete control over hardware, and that is also the job of an OS kernel. Their level of abstraction is a perfect match. But the things at which low-level languages are slow - heap allocations and dynamic dispatch - are exactly the things that applications (not kernels) naturally gravitate towards needing over time.

Hmm. To put a pin in this, you're saying the following is harder to do as an application grows in complexity:

- Avoiding lots of little allocations (using arenas, value types)

- Avoiding dynamic dispatch

Two things that Java doesn't need to avoid, because it's optimized for them. What I'm unclear on is the perspective that it's inevitable. I don't know what scale of apps you're talking about.

In the case of Rust, it has arenas, stack-allocated structs, and generics via monomorphization. Not only can you avoid both of these things, it doesn't even seem that difficult. If you're saying the borrow checker just becomes too cumbersome for sufficiently large applications, that's fine.

> What happened was that they experienced a large drop in performance, but to save face, they spent 6-12 months carefully optimising the Rust code, and in the end managed to match, though not exceed, Java's performance.

There's really not enough detail here to draw from it. But they got to the same performance as Java with less RAM, at the cost of dev experience. Does that not support what I said? Maybe that 6-12 months for the same performance was way too much, but for a smaller app, that could actually be a worthy tradeoff, no? Like...a desktop app?

> Let's look at one of the most famous terrible benchmarks: The Computer Language Benchmarks Game (it's terrible not only because it compares different algorithms, but also because it has no benchmarks that are long-running, none with interesting memory management, and no concurrent benchmarks - the very things most programs today do)

If the Benchmarks Game is not long enough, dynamic enough, or allocate-y enough, then it's not worth talking about. But what is worth talking about?

This is difficult because you won't accept benchmarks that are too small because they are unfair to the JIT, but you also won't accept applications that are too small. Apparently SQLite, 150k LOC, is not big enough to be relevant to this discussion. So all we have is anecdotal experience that Java is more performant for large, long-lived processes with many contributors. I've certainly read a lot of reports from people rewriting to Rust and getting much leaner applications, but maybe they weren't working on apps complex enough to force them into dynamic dispatch or many allocations. Or maybe it was pure cope. I don't know because their anecdotes and your anecdotes are not very detailed.

But see how far we have drifted from what the original claim was, which is the claim you cannot reduce RAM consumption without harming CPU utilization. You said Rust isn't particularly fast, but Java is. That's a really strong claim considering we have painted a much narrower scope of when that's true. Most of us aren't working on 3M LOC faang apps. We are working on smaller things, or desktop apps. In those cases, the ratio of RAM consumption to CPU efficiency is much better in Rust than it is in Java. Isn't using Java just a straight up RAM loss for those cases?

> That depends on what else these people want to use their computers for while running an Electron app. By far the largest group of people I've seen complain are people here on HN who like counting MBs rather than look at the overall utilisation picture

That doesn't seem terribly fair. Grandma doesn't have the vocabulary to complain about RAM, true. But her computer is slow, she asked her grandson for help, and her grandson told her to use Spotify in the browser, not download the app. And now she has to be mindful of what she has open, even though 8gb of RAM is actually a lot, we've just lost sight of it.

> Of course if you deploy a program that uses a lot of resource X to machines where X is more restricted than the other resources the program uses, you should optimise the consumption of X

The problem is CPU and RAM usage are fundamentally different. I don't get to know what will be run with my program, so I don't get to know how restricted RAM is. If a computer is CPU limited, at the very least, it won't be terribly busy with programs that aren't being used. But for most programs, RAM is allocated and then just sitting there, whether the program is being used or not. So deploying an Electron app kinda feels like a middle finger to your users, because even though it doesn't need to, it limits the amount of programs they can have open. Not to save on CPU, but because Chromium needs that RAM to work in the first place. It's purely a dev experience decision because people don't know how to ship desktop apps. It's pure waste for the user.


> In the case of Rust, it has Arenas, stack allocated structs, and generics via monomorphization. Not only can you avoid both of these things, it doesn't even seem that difficult. If you're saying the borrowchecker just becomes too cumbersome to do for sufficiently large applications, that's fine.

Not really.

First, let's look at stack-allocated structs and think about how much data can live in them. The typical stack size is 2MB, but because the only live data in stacks are in caller functions, we can say that on average, the amount of live data that a stack holds is 1MB. Now, look at how much RAM an application uses in MBs and divide it by the number of threads. Usually, the ratio is much higher than 1MB, which means that data in stacks is not a significant portion of the program's data. (Async changes this calculus a bit, but async is extremely limited in Rust as it doesn't allow recursion, FFI, or dynamic dispatch; proper user-mode threads, like the ones in Java and Go, actually make stack allocation more useful.)
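
To put made-up but representative numbers on it: a service with, say, 4GB of live data across 200 threads averages about 20MB of data per thread, while the live data on each stack is at most around 1MB, so stack allocation can cover roughly 5% of the footprint at best, no matter how diligently you use it.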

Now let's look at arenas. Arenas are extremely efficient because they offer a similar RAM/CPU tradeoff knob as moving GCs, but they're not as general (if your allocation pattern supports them well, they're great, but you can't generally use them). But in Rust, things are much worse, because arenas are quite limited; too many standard-library data structures, including strings, vectors, and maps can't easily be plugged into an arena. The only language that gives you arenas' full power (which, again, is not completely general) is Zig. This is one of the reasons hardcore low level programmers find Rust so underwhelming (the other being that too many things that are important in low-level programming, including basic data structures but also benign concurrency, require unsafe).

> But they got to the same performance as Java with less RAM, at the cost of dev experience.

You say "dev experience" as if it's some quality-of-life thing. They traded off a cheap resource, RAM, for an eternal maintenance and evolution cost that would only grow higher as the program grows. And remember that RAM isn't entirely fungible. It's hard to get less than 1GB per core (either in bare metal or in cloud VMs/containers) so using less RAM often saves you $0.

> but for a smaller app, that could actually be a worthy tradeoff, no? Like...a desktop app?

I said that low-level languages can offer good performance in small programs, but many desktop apps aren't small. Claude Code's CLI is over 500KLOC.

> So all we have is anecdotal experience that Java is more performant for large, long lived processes with many contributors

If anything, benchmarks are much more anecdotal. Not only are there fewer benchmarks than applications, but they don't even resemble real programs. But yeah, ever since operations lost their intrinsic costs some 20 years ago - with CPU cache hierarchies, branch prediction, and ILP, more powerful optimising compilers, and more elaborate GCs/memory allocators - the ability to generalise from one program to another is close to nil. So yeah, experience is all we have to go with. Going with the numbers we have (some benchmarks that don't extrapolate) rather than the numbers we need but don't have doesn't help.

I can tell you that the loss of intrinsic operation costs has made our lives as compiler/runtime developers much harder, because we can no longer tell people that this operation is generally fast or generally slow. But that doesn't change the fact that this is our reality.

> what the original claim was, which is the claim you cannot reduce RAM consumption without harming CPU utilization

No. I wrote, and I quote: "Using a lot less RAM often implies using more CPU."

> That's a really strong claim considering we have painted a much narrower scope of when that's true.

Did we? Most software (measured by the distribution of paid programmers) is in large applications.

> Most of us aren't working on 3M LOC faang apps. We are working on smaller things, or desktop apps

I don't think that's true at all. Forget FAANG. Most software isn't written by software companies at all, but is in-house software (well, Netflix isn't a software company, so I guess it's one FAANG letter). The bulk of software is in things like telecom management and billing, banking and finance, manufacturing control, logistics and shipping, healthcare and hospitality, retail and payment processing, travel, government, defence. 3MLOC is quite typical. People who work on smaller software are overrepresented in Silicon Valley and, I'm guessing, among HN readers, but they're the outliers.

> Grandma doesn't have the vocabulary to complain about RAM, true. But her computer is slow, she asked her grandson for help, and her grandson told her to use Spotify in the browser, not download the app. And now she has to be mindful of what she has open, even though 8gb of RAM is actually a lot, we've just lost sight of it.

Does she, though? I doubt she's running anything intensive in the background, so she's really only using one program at a time, and SSDs are fast enough these days to page in virtual memory when she switches programs, unless the one program she's currently using eats up the 8GB. I agree that if her OS - the one thing she needs to run in the background - is taking up a lot of RAM that could be a problem, but the OS is special. Her computer is slow not because she's using a program that eats up a lot of RAM, but because she's inadvertently running a lot of stuff in the background that shouldn't be running at all (browser plugins? some programs that add themselves as login items?). A Surface Laptop comes with 16GB of RAM. No single program uses even half of that.

> The problem is CPU and RAM usage are fundamentally different. I don't get to know what will be run with my program, so I don't get to know how restricted RAM is.

You'd think that, but that's not the case. I admit that I only recently started thinking deeply about this, thanks to some conversations with a colleague who's one of the world's leading experts on memory management, and it was so eye-opening that I gave a talk about this at the recent JavaOne (because my colleague wasn't available). There are two sides to this:

1. On the demand side, the key is that the use of RAM necessitates the use of CPU (and vice versa): writing and reading to/from RAM requires CPU, but also we write to RAM only when we expect the program to read it in the future. This means that any CPU we use takes away the ability of another program to use some RAM (because using RAM requires CPU). To give the basic intuition for this, I mentioned the extreme example of a program that uses 100% of CPU. Such a program effectively captures 100% of RAM no matter how much of it it actually uses, because no other program can use any RAM, as no other program has the CPU available to access it. You don't need to know anything about what other programs do. Another way to think about this is that the machine is spent whenever the first of RAM and CPU is exhausted.

2. On the supply side, RAM and CPU - whether on bare metal or in virtualised hardware - are effectively sold as a package (it's hard to get less than 1GB per core these days, except on embedded devices). Furthermore, both moving GCs and (to a far lesser extent) memory allocators can trade RAM and CPU (sophisticated memory allocators aren't quick to return RAM to the OS and maintain internal buffers).

So even though it is true that different programs may have different CPU/RAM usage patterns, you have to think about the ratio rather than CPU and RAM in isolation, and try to achieve some approximate balance. To put it simply, if a program uses a lot of CPU it doesn't make sense for it to use little RAM, because by using a lot of CPU it is effectively depriving other programs of their ability to use RAM (as that requires CPU). There are some exceptions, such as large caches, but the tradeoffs there are very different and too complicated to go into here (I did cover that in my talk).

> It's purely a dev experience decision because people don't know how to ship desktop apps. It's pure waste for the user.

No. I mean, some of it is probably waste, but:

1. What you call "dev experience" also affects the user because it directly impacts the cost of software. Users want cheaper software.

2. More relevant to this particular discussion is what else the user could do. Having "more programs open" isn't a problem thanks to SSDs and virtual memory. So we're talking about programs that are actively using the CPU for something, and they, too, need a balance of the RAM/CPU ratio.

I'm not trying to be dogmatic in the other direction and assert that Electron is necessarily the best tradeoff. But I'm saying that efficiency is ultimately about money that is spent on a combination of RAM, CPU, and software, and when you look at the full picture you see that it's more complicated than it seems. It's not that the software industry has decided to waste users' money. If it did, there would be a competitive edge to programs that use less RAM, but we don't see that competitive edge. What we do see is a few people on HN saying how they simply can't live with VS Code's 50ms keystroke latency and how amazing is some other editor with only 20ms latency that's likely to go out of business soon [1]. The people who made these decisions aren't some early-career developers who just like hot code reloading or some such.

[1] Yes, I do think Rust is more hardware-efficient than JS, but here I'm looking at an even bigger picture. And yes, if you rewrite from JS to C++/Java/Rust/Go you can win on hardware, but as I said at the very beginning, any such rewrite is not really "an optimisation".


> To put it simply, if a program uses a lot of CPU it doesn't make sense for it to use little RAM

The fallacy in that reasoning is that a program that's using a lot of CPU (especially if it's a huge MLOC-sized app) is most likely spending its CPU on memory throughput, not pure number-crunching compute! So at least for the enterprise app case (not pure number crunching), you'd actually need a tunable tradeoff between memory throughput and total RAM footprint, and adopting copying/moving GCs just doesn't give you that. Collection cycles are a huge burden on memory throughput: thus, indirectly, on the very thing you're calling "CPU". The theoretical prospect of winning by forgoing collection cycles outright (pure bump arena allocation) is explicitly excluded here since we're talking about long-running programs that will at some point need to garbage collect.

Heap allocation may have marginally higher "CPU" use in the pure compute sense, but that's exactly the kind of CPU use that does trade off successfully with a lower RAM footprint.

Similarly, non-moving concurrent garbage collectors like Go's also successfully navigate this tradeoff compared to moving/copying collectors, because their collection work, while compute- and to some extent memory-traffic intensive (though less so than if copying/moving memory was involved!) can be largely (though not completely - some minor compute overhead on the hot path is still present) shunted off to a lower-priority background thread.

On the other side of the tradeoff, arenas and caches increase memory footprint in a way that's low-impact on memory throughput (unlike pervasive use of a copying/moving GC) because only live data is accessed as needed, and deallocating the arena is a single operation. The tradeoff is actually highly favorable to low-level languages, which commonly use arenas to manage challenges with heap allocation such as fragmentation.


I feel like this has been a great discussion but all the good technical talk is tapering off. There's more rhetorical semantics now than anything. But I appreciate you teaching me. I'm a bit tired in this reply.

I wish I could find you a few reports from people on here basically renouncing Java because they could not optimize it any further after 20 years programming in it, and moving to Rust. I'd be curious what you'd think.

> You say "dev experience" as if it's some quality-of-life thing. They traded off a cheap resource, RAM, for an eternal maintenance and evolution cost that would only grow higher as the program grows

No, I say dev experience because I brought it up earlier, and staked my claim on it. Because it's an umbrella term that covers nice-to-haves and how ergonomic the language and ecosystem are. The antithesis would be lots of repetitive plumbing that slows down feature release. It's one part of the triangle. They now are shipping way behind but wound up with a product that is just as fast and uses less RAM. That kind of decision can matter to other projects. Smaller projects likely wouldn't have such a slow turnaround.

Again, we're not advocating for everyone to do rewrites. Threads like this are people begging app developers to stop using stuff not appropriate for desktop apps.

> I said that low-level languages can offer good performance in small programs, but many desktop apps aren't small. Claude Code's CLI is over

Err, Claude Code is very special, yes. Couldn't tell you why that tool needs to use that much, but most desktop apps don't. They are built on vendor code and keep the actual app code small, and that makes them excellent candidates for what we're talking about.

> 3MLOC is quite typical. People who work on smaller software are overrepresented in Silicon Valley and, I'm guessing, among HN readers, but they're the outliers

I don't agree, unless you have some stats I don't know about. I mean I'm really jealous that you even know someone that worked on an app of that size. Most people are coding for one of the millions of mid-sized businesses dotted all over the country. They are on 20-year-old code bases that make great revenue. Everyone there is nice and meets with you weekly. The devs answer to the clients directly. They don't really need to grow their business endlessly, but there's lots of maintenance to do. I've worked with healthcare companies; they were certainly not 3M lines of code. What you're describing is an extremely narrow class of software that most people will never touch. But I think it sounds cool.

> If anything, benchmarks are much more anecdotal

Not really. I concede they only measure what they do, and it might not be much, but at least they measure something, and it's public and reproducible. Anecdotes are vague stories that are impossible to evaluate. I had an anecdote of someone saying they can't use Java anymore because, even with 20 years of experience, they cannot optimize Java any further for what they need. They rewrote it in Rust. It works much better now, and it's not even close. What am I to make of that?

> Her computer is slow not because shes using a program that eats up a lot of RAM, but because she's inadvertently running a lot of stuff in the background that shouldn't be running at all

There are absolutely apps that use 8GB of RAM. And page swaps are not good, even with an SSD. They're a real problem.

I just find it lame to tell people this when we could ship leaner apps and it wouldn't even be that hard. It's 2026 and people should be able to have whatever windows they want open. Even 8GB of RAM is a lot; we've just forgotten about it. Shit, my web browser uses 4GB of RAM idle.

> So even though it is true that different programs may have different CPU/RAM usage patterns, you have to think about the ratio rather than CPU and RAM in isolation, and try to achieve some approximate balance. To put it simply, if a program uses a lot of CPU it doesn't make sense for it to use little RAM, because by using a lot of CPU it is effectively depriving other programs of their ability to use RAM (as that requires CPU). There are some exceptions, such as large caches, but the tradeoffs there are very different and too complicated to go into here (I did cover that in my talk).

That's a cool realization. But I think it's slippery. Only in extreme scenarios will your CPU actually block RAM. The 100% usage scenario makes sense. But most of the time, your CPU is going to be underutilized and capable of letting every app use RAM freely. Obviously the more direct problem would be someone's RAM being sucked up by different apps.

> It's not that the software industry has decided to waste users' money. If it did, there would be a competitive edge to programs that use less RAM, but we don't see that competitive edge. What we do see is a few people on HN saying how they simply can't live with VS Code's 50ms keystroke latency and how amazing is some other editor with only 20ms latency that's likely to go out of business soon

How? If you're forced to use work software, you have no competition to go to. Same for your music app, your social network, your team's chat tool. The "competitive edge" argument requires users to actually have a choice, and for most desktop software they don't: they use what their employer, school, or social network has standardized on. Where users do have free choice, they gravitate toward leaner options constantly. Sublime kept paying customers against free Electron alternatives. Mobile platforms enforce resource discipline and have no Electron equivalent. The competitive edge for leanness exists; it just can't express itself when ecosystem effects lock users in.

So I still feel strongly there are cases where you could make sufficiently small programs in Rust that wouldn't devolve into spaghetti. That would give you great performance and RAM usage. I still want to find the example of my anecdote, but I'm tired.


Only if the software is optimised for either in the first place.

Tons of software out there where optimisation of both memory and CPU has been pushed to the side because development hours are more costly than a bit of extra resource usage.


That'll stay true for consumer software, because the cost for extra resource usage is not borne by the development house.

The tradeoff has almost exclusively been development time vs resource efficiency. Very few devs are graced with enough time to optimize something to the point of dealing with theoretical tradeoff balances of near optimal implementations.

That's fine, but I was responding to a comment that said that RAM prices would put pressure to optimise footprint. Optimising footprint could often lead to wasting more CPU, even if your starting point was optimising for neither.

My response was that I disagree with the conclusion that something like "pressure to optimize RAM implies another hardware tradeoff" is the primary thing that will give, not that I'm changing the premise.

Pressure to optimize can more often mean just setting aside work to make the program nearer to being limited by algorithmic bounds, rather than doing what was quickest to implement and not caring about any of it. Given the same amount of time, replacing bloated abstractions with something more lightweight usually nets more memory gains overall than trying to tune something heavy to use less RAM at the expense of more CPU.


You're thinking of an algorithmic tradeoff, but this is an abstraction tradeoff.

Some of the algorithms are built deep into the runtime. E.g. languages that rely on malloc/free allocators (which require maintaining free lists) are making a pretty significant tradeoff of wasting CPU to save on RAM, as opposed to languages using moving collectors.

Free lists aren't expensive for most usage patterns. For cases where they are we've got stuff like arena allocators. Meanwhile GC is hardly cheap.

Of course memory safety has a quality all its own.


hopefully not implying needing a gc for memory safety...

Yeah, there's always Fil-C (Rust isn't memory safe in practice).

> Free lists aren't expensive for most usage patterns.

Whatever little CPU they waste is often worth more than the RAM they save.

> For cases where they are we've got stuff like arena allocators.

... that work by using more RAM to save on CPU.


GC burns far more CPU cycles. Meanwhile I'm not sure where you got this idea about the value of CPU cycles relative to RAM. Most tasks stall on IO. Those that don't typically stall on either memory bandwidth or latency. Meanwhile CPU bound tasks typically don't perform allocations and if forced avoid the heap like the plague.

> GC burns far more CPU cycles

Far less for moving collectors. That's why they're used: to reduce the overhead of malloc/free based memory management. The whole point of moving collectors is that they can make the CPU cost of memory management arbitrarily low, even lower than stack allocation. In practice it's more complicated, but the principle stands.

The reason some programs "avoid the heap like the plague" is because their memory management is CPU-inefficient (as in the case of malloc/free allocators).
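
For intuition, here's a minimal Rust sketch of the bump allocation a moving collector's young generation relies on - just the principle, not how any particular collector is implemented:

    // Allocation in a bump region is a pointer increment plus a bounds check,
    // versus a malloc/free allocator that has to search and maintain free lists.
    struct BumpRegion {
        buf: Vec<u8>,
        next: usize,
    }

    impl BumpRegion {
        fn new(capacity: usize) -> Self {
            Self { buf: vec![0; capacity], next: 0 }
        }

        // Returns the offset of `size` freshly reserved bytes, or None when the
        // region is full (a real collector would evacuate the few survivors
        // elsewhere and reset `next` to 0 at that point).
        fn alloc(&mut self, size: usize) -> Option<usize> {
            let start = self.next;
            let end = start.checked_add(size)?;
            if end > self.buf.len() {
                return None;
            }
            self.next = end;
            Some(start)
        }
    }

    fn main() {
        let mut nursery = BumpRegion::new(1024);
        let a = nursery.alloc(64).unwrap();
        let b = nursery.alloc(64).unwrap();
        println!("allocated at offsets {a} and {b}");
    }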

> Meanwhile I'm not sure where you got this idea about the value of CPU cycles relative to RAM

There is a fundamental relationship between CPU and RAM. As we learn in basic complexity theory, the power of what can be computed depends on how much memory an algorithm can use. On the flip side, using memory and managing memory requires CPU.

To get the most basic intuition, let's look at an extreme example. Consider a machine with 1 GB of free RAM and two programs that compute the same thing and consume 100% CPU for their duration. One uses 80MB of RAM and runs for 100s; the other uses 800MB of RAM and runs for 99s (perhaps thanks to a moving collector). Which is more efficient? It may seem that we need to compare the value of 1% CPU reduction vs a 10x increase in RAM consumption, but that's not necessary. The second program is more efficient. Why? Because when a program consumes 100% of the CPU, no other program can make use of any RAM, and so both programs effectively capture all 1GB, only the second program captures it for one second less.

This scales even to cases where CPU consumption is less than 100%, as the important thing to realise is that the two resources are coupled. The thing that needs to be optimised isn't CPU and RAM separately, but the RAM/CPU ratio. A program can be less efficient by using too little RAM if using more RAM can reduce its CPU consumption to get the right ratio (e.g. by using a moving collector), and vice versa.
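
As a toy sketch of that arithmetic (same made-up numbers as above):

    // Both hypothetical programs saturate the CPU, so while either one runs, the
    // whole 1 GB of free RAM is effectively unavailable to anything else.
    fn ram_seconds_captured(free_ram_gb: f64, runtime_s: f64) -> f64 {
        free_ram_gb * runtime_s
    }

    fn main() {
        let frugal = ram_seconds_captured(1.0, 100.0); // 80 MB footprint, runs 100 s
        let hungry = ram_seconds_captured(1.0, 99.0);  // 800 MB footprint, runs 99 s
        println!("frugal: {frugal} GB·s captured, hungry: {hungry} GB·s captured");
    }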


There are (at least) two glaring issues with your analysis. First, the vast majority of workloads don't block on CPU (as I previously pointed out) and when they do they almost never do heap allocations in the hot path (again, as I previously pointed out). Second, we don't use single core single thread machines these days. Most workloads block on IO or memory access; the CPU pipeline is out of order and we have SMT for precisely this reason.

Anyway I'm not at all inclined to blindly believe your claim that malloc/free is particularly expensive relative to various GC algorithms. At present I believe the opposite (that malloc/free is quite cheap) but I'm open to the possibility that I'm misinformed about that. You're going to need to link to reputable benchmarks if you expect me to accept the efficiency claim, but even then that wouldn't convince me that any extra CPU cycles were actually an issue for the reasons articulated in the preceding paragraph.


> There are (at least) two glaring issues with your analysis. First, the vast majority of workloads don't block on CPU (as I previously pointed out) and when they do they almost never do heap allocations in the hot path (again, as I previously pointed out). Second, we don't use single core single thread machines these days. Most workloads block on IO or memory access; the CPU pipeline is out of order and we have SMT for precisely this reason.

This doesn't matter because if you're running a single program on a machine, it might as well use all the CPU and all the RAM. As long as you're under 100% on both, you're good. But we want to utilise the hardware well because we typically want to run multiple programs (or VMs) on a single machine, and the machine is exhausted when the first of CPU or RAM is exhausted. So the question is how should your CPU and RAM usage be balanced to offer optimal utilisation given that the machine is spent when the first of CPU and RAM is spent. E.g. you can only run two programs, each using 50% of CPU; if they each use only 5% of RAM, you've saved nothing as no third program can run. So if you spend either one of these resources in an unbalanced way, you're not using your hardware optimally. Using 2% more CPU to save 200MB of RAM could be suboptimal.

I'm not saying that for every program that uses X% CPU should also use exactly X% of RAM or it must be wasting one or the other, but that's the general perspective of how to think about efficiency. Using a lot of one and little of the other is, broadly speaking, not very efficient.

> Anyway I'm not at all inclined to blindly believe your claim that malloc/free is particularly expensive relative to various GC algorithms. At present I believe the opposite (that malloc/free is quite cheap) but I'm open to the possibility that I'm misinformed about that.

You are.

> You're going to need to link to reputable benchmarks if you expect me to accept the efficiency claim, but even then that wouldn't convince me that any extra CPU cycles were actually an issue for the reasons articulated in the preceding paragraph.

I don't believe there are any reputable benchmarks of full applications (which is where memory-management matters) that are apples-to-apples. I'm speaking from over two decades of experience with C++ and Java.

The important property of moving collectors is that they give you a knob that allows you to turn RAM into CPU and vice-versa (to some extent), and that's what you want to achieve the efficient balance.


Moving collectors as generally used are a huge waste of memory throughput, and this shows up consistently in the performance measurements. Moving data is very expensive! The whole point of ownership tracking in programming languages is so that large chunks of "owned" data can just stay put until freed, and only the owning handle (which is tiny) needs to move around. Most GC programming languages do a terrible job of supporting that pattern.

That's just not true. To give you a few pieces of the picture, moving collectors move little memory and do so rarely (relative to the allocation rate):

In the young generation, few objects survive and so few are moved (the very few that survive longer are moved into the old gen); in the old generation, most objects survive, but the allocation rate is so low that moving them is rare (although the memory management technique in the old gen doesn't matter as much precisely because the allocation rate is so low, so whether you want a moving algorithm or not in the old gen is less about speed and more about other concerns).

On top of that, the general principle of moving collectors (and why in theory they're cheaper than stack allocation) is that the cost of the overall work of moving memory is roughly constant for a specific workload, but its frequency can be made as low as you want by using more RAM.
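
A hedged back-of-the-envelope model of that knob (the numbers are made up, and real collectors are more complicated than this):

    // In a copying young gen, a collection happens roughly when the free space
    // fills with garbage, so growing the heap stretches the time between
    // collections while the per-collection work (copying survivors) stays
    // roughly constant for a given workload.
    fn collections_per_second(alloc_rate_mb_s: f64, heap_mb: f64, live_mb: f64) -> f64 {
        alloc_rate_mb_s / (heap_mb - live_mb)
    }

    fn main() {
        let alloc_rate = 500.0; // MB/s of mostly short-lived allocation
        let live = 100.0;       // MB that actually survives
        for heap in [512.0, 1024.0, 4096.0] {
            println!(
                "{heap} MB heap -> ~{:.2} collections/s",
                collections_per_second(alloc_rate, heap, live)
            );
        }
    }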

The reason moving collectors are used in the first place is to reduce the high overhead of malloc/free allocators.

Anyway, the general point I was making above is that a machine is exhausted not when both CPU and RAM are exhausted, but when one of them is. Efficient hardware utilisation is when the program strikes some good balance between them. There's not much point to reducing RAM footprint when CPU utilisation is high or reducing CPU consumption when RAM consumption is high. Using much of one and little of the other is wasteful when you can reduce the higher one by increasing the other. Moving collectors give you a convenient knob to do that: if a program consumes a lot of CPU and little RAM, you can increase the heap and turn some RAM into CPU and vice versa.


Or just using less Electron and writing less shit code.

Though sadly the new types of Kindle books require a method of extracting them to PDF which is an order of magnitude harder than the old Calibre DeDRM method. I had to boot Bluestacks and export license files and rub my tummy and pat my head and do the Hokey Pokey… but in the end, the books are now 100% mine.

Edit: It’s been a while. Looks like the process is more streamlined, but still not what it used to be.


Harder, for sure. But you just need one copy in the wild...

I’ve had a pair of Nook Simple Touch for over ten years and they are wonderful for PDFs. Stored 100% offline. Good for prepper books.

What are “Anna’s”?


I honestly had no idea they’re so libertarian-capitalist. I figured it was government-led, government-run.

Could someone please explain the minute hand? It says it’s Nine : Twenty-nine but the minute hand is pointing at the word Twelve.

Arrange all sixty minutes alphabetically around the clock. Same for seconds. Twenty-nine is near the end of the alphabet.

The labels are only relevant to the hours. For some reason the hour labels don't align well with where the hour hand is.
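
For anyone curious, here's a Rust sketch of the layout rule as I understand it (a guess at the app's behaviour, not its actual code): spell out 0-59, sort the spellings alphabetically, and place them clockwise in that order.

    fn spell(n: usize) -> String {
        const ONES: [&str; 20] = [
            "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
            "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
            "sixteen", "seventeen", "eighteen", "nineteen",
        ];
        const TENS: [&str; 6] = ["", "", "twenty", "thirty", "forty", "fifty"];
        if n < 20 {
            ONES[n].to_string()
        } else if n % 10 == 0 {
            TENS[n / 10].to_string()
        } else {
            format!("{}-{}", TENS[n / 10], ONES[n % 10])
        }
    }

    fn main() {
        let mut minutes: Vec<usize> = (0..60).collect();
        minutes.sort_by_key(|&n| spell(n));
        // "twenty-nine" sorts near the end, so the minute hand ends up near the
        // top of the dial - which is where the asker saw the "Twelve" hour label.
        let pos = minutes.iter().position(|&n| n == 29).unwrap();
        println!("twenty-nine sits at position {pos} of 59 around the dial");
    }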


THAT explains it, thank you.

I tweaked the labels when I was working on the combined mode, where I figured they make more sense as indicative of the general, combined alphabetical area. I hadn't considered they'd be confusing in the 3-hand mode. I've split the positioning of them between the two now. Appreciate the feedback! It does make it _less_ accursèd now, which I feel could be either an improvement or a regression. I leave it to weigh on your consciences.

Check the "show all labels" option to make it correct. The big labels are incorrect; the "twelve" should be "twenty-five".

I think the labels are pointlessly confusing.

I mean to be fair the entire thing is pointlessly confusing.

Maybe, but having labels and hour markers that contradict the meaning of the hand positions is just perverse :-)

I have changed it now (see another comment above). But now it is less accursèd! Ah well.

Hmm. I wonder what it would look like if you added the corresponding "minute" labels (eight, five, four, etc) at the appropriate places. It might make it at least a little feasible to read the time!

For inspiration: https://www.alamy.com/clock-face-hour-dial-with-numbers-dash...


Yeah, might add this as a toggle. Seems to be the bit that people ask about.

Be sure to inform the author of the article, who is currently making money on his 1GB VPS, that he hasn’t found a happy medium.


Saving 15 USD on 0 USD MRR while still building the business is priceless. Virtually infinite runway.


Only if your time is worthless and someone else is paying your living expenses.

