Sorry this is long, but you successfully nerd-sniped me :)
> Other than C, Rust, Go, Swift? C# can use value types, Java cannot. So famously that Project Valhalla has been highly anticipated for a long time. Obviously the JVM team thinks this is a gap and want to address it. That is enough in itself to make someone consider a different language.
As someone working on the JVM, I can tell you we're very much interested in Valhalla and largely for cache-friendliness reasons, but Java certainly doesn't box every value today, and you are severely overstating the case. If you think you can save on both RAM and CPU by preferring a low-level language (or Go, which is slower almost across the board), you're just wrong. But I want to focus on the more important general point you made first.
> My feeling, and the feeling of most people, is that dev experience has been so heavily prioritized that we now have abstractions upon abstractions upon abstractions, and software that does the same thing 20 years ago is somehow leaner than the software we have today.

The narrow claim that "within a fixed design, reducing RAM often costs CPU" is true.
The problem here is that in some situations there's truth to what you're saying, but in others, it is just seriously wrong. I think the misconception comes precisely because "most people" these days don't have the long experience with low-level programming that people in my generation of developers do, and so you're not aware that many of these abstractions are performance optimisations that come from deep familiarity with the performance issues of low-level programming (I started out programming in C and x86 assembly, and in the first long job of my career I was working on hard- and soft-realtime radar and air traffic control systems in C++).
Low-level languages aren't meant to be fast (and aren't particularly fast). They're meant to give you direct control over the use of hardware. When it comes to small software, this control does frequently translate to very good performance, but as programs get larger, it makes low-level languages slow. It is true that Java was intended to help developer productivity, but it's also meant to solve some of the intrinsic performance issues in low-level languages, which it does rather well. After all, our team has been made up of some of the world's biggest experts in optimising compilers and memory management, and removing some of C++'s overheads is very much a central goal.
So where do things go wrong for low-level languages? The core problem is that these languages split constructs into fast and slow variants, e.g. static vs dynamic dispatch and stack vs heap allocation. The programmer needs to choose between them. What happens is:
1. As programs grow larger and more complex, the movement is almost completely monotonic toward the more expensive, and more general, variants.
2. There is a big difference between "a fast program could hypothetically be written" and "your program will be fast". Getting good performance out of low-level languages requires not only experience, but a lot of effort. For example, you can write a small benchmark and see that malloc/free are pretty fast these days, but that's often true only for the benchmark, where objects tend to be of the same size, and their allocation and deallocation patterns are regular. Memory allocators degrade over time, and they're quite bad when patterns are irregular, which is what happens in real programs, especially large ones. There's also the question of meticulous care around correctness. When Rust first came out I was very excited to see a few important correctness issues solved without loss of control, but was then severely disappointed. Almost anything that is interesting from a performance perspective for us low-level programmers requires unsafe. Even a good hashmap requires unsafe. The performance cost of safety in Rust is higher than it is in Java, and non-experts end up writing slower programs (when they're not small, at least).
Such performance issues have plagued low-level programming forever, and Java is reducing these overheads. The idea that high abstractions can improve performance was possibly first stated in Andrew Appel's paper, "Garbage Collection Can Be Faster than Stack Allocation" in the eighties, in which he wrote: "It is easy to believe that one must pay a price in efficiency for this ease in programming... But this is simply not true."
Instead of a static/dynamic dispatch split, Java offers only the general construct (dynamic), and the compiler can "see through" dynamic dispatch and inline it better than any low-level compiler ever could. You can say that surely there has to be some tradeoff, and there is, but not in peak performance. The tradeoff is that 1. you lose control and can't guarantee that the optimisation will be made, so you get good average performance but maybe not the best worst-case performance (which is why it's not hard to beat Java in small programs if you know what you're doing), and 2. the compiler needs to collect profiles as the program runs, which results in a "warmup" period.
(If, like me, you like Zig, you might have seen Kelley talk about the "vtable barrier" in low-level languages; this doesn't exist in Java. You may also be interested in this talk, "How the JVM Optimizes Generic Code - A Deep Dive", by John Rose: https://youtu.be/J4O5h3xpIY8.)
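To make the devirtualisation point concrete, here's a minimal Java sketch (the `Shape`/`Circle`/`Devirt` names are mine, and whether HotSpot actually inlines the call depends on the profile it observes at runtime, so treat the comments as a description of the typical monomorphic case, not a guarantee):

```java
// In Java every non-final instance method call is "virtual", but the JIT
// profiles the receiver types it actually sees at each call site. If only
// one implementation ever shows up (a monomorphic site), HotSpot can
// devirtualize and inline area() through the interface call, guarded by a
// cheap type check -- something a C++ compiler can't do for a vtable call
// across compilation units without profile data.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

public class Devirt {
    // The interface call below is written as dynamic dispatch, but when
    // profiling shows every element is a Circle, the JIT typically inlines
    // the body into the loop.
    static double total(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) sum += s.area(); // virtual call in the source
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1), new Circle(2) };
        System.out.println(total(shapes)); // ~ pi * (1 + 4)
    }
}
```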
As for memory, not only do moving collectors not degrade (or fragment) over time, they can use the RAM chip as a hardware accelerator. Unfortunately, when a program uses the GPU for acceleration it's considered clever, but when it uses the RAM chip for acceleration it's considered bloated, even though every CPU core these days comes with at least 1GB of RAM that you might as well use if you're using up the core, as that's effectively free.
The people who consider that bloated are mostly those who haven't struggled with low-level programming long enough or on software that's large enough (they're the people who say, "I wrote this lean and fast gizmo by myself in 5 months"; 99% of the value delivered by software is in software written by large teams and maintained over many years). When I was working on sensor-fusion and air-traffic-control software in the nineties, it wasn't "lean"; we just had no choice. We constantly had to sacrifice performance for correctness. Of course, once machines got better, we switched to Java for better performance. God could have written a faster program of that size in C++, but not a large team made up of people with different levels of experience. People who think C++ (or Rust) is particularly efficient are people who haven't written anything big and long-maintained with it.
In conclusion:
1. Sometimes layers of abstractions add performance overheads, and sometimes they remove them. It is not generally true that more abstraction/generality has a performance cost, especially when comparing different languages, although it is almost always true within one language (e.g. dynamic dispatch is never faster than static dispatch, and is often slower, in C++, but dynamic dispatch in Java can be faster than even static dispatch in C++, and the tradeoffs are elsewhere). If you didn't believe that, you'd be writing all your code in Assembly (which is what I did to get the fastest programs in the early nineties, but it's just not generally faster today thanks to good optimisation algorithms in compilers).
2. Low-level languages give you control, not speed. This control typically translates to better performance in small programs and to worse performance in large ones. This performance problem is intrinsic to low-level programming.
> Removing boxing can improve layout, footprint, and CPU utilization simultaneously. That would lie outside the framework "You can't improve one without harming the other."
First, the footprint won't reduce by much. E.g., in Java, boxing could cost you 10% of your footprint, but the RAM-assisted acceleration could be 80% of the footprint.
Second, yes good layouts help CPU utilisation, but today you can't get that without giving up on other things that harm performance. Dynamic dispatch and memory management in C++ and Rust are just too slow, and while Zig can be blazing fast, it's not easy to write large software in it without compromising performance any more than in any other low-level language. I hope that with Valhalla, Java will be the first language to let you enjoy everything at once, but it's not really an option today.
> I'm saying Electron uses a lot of RAM and it has nothing to do with offloading work from the CPU, and everything to do with taking the most brute force approach to cross app deployment that we possibly can.
That developers choose it because it's a "brute force approach to cross app deployment" doesn't necessarily mean that it doesn't also offload work from the CPU, but yes, Electron apps are probably very inefficient from some perspectives. But I think this is also overstated by people who are overly sensitive. When we say something is inefficient, it means that we spend on it more than we have to, but what we really mean is that we could spend the resource we save on something else instead. On my M1 laptop, I comfortably run three Electron apps and two browsers simultaneously without much harming the speed at which I can, say, compile HotSpot, probably because SSDs are fast enough for virtual memory in interactive GUIs. I can't think of anything else I could use my laptop's resources for if the apps were leaner on RAM. Reducing the consumption of a resource that can't be meaningfully used for other work isn't real efficiency, and if it comes at the expense of anything useful, it's downright inefficient.
Well I'm glad you were nerd sniped, I appreciate the response. I've learned a lot and it's a good resource for people. I know you're an expert here. Most of the conversation for me has been clarifying my confusion based on my model of programming. There are parts that are way out of depth for me, but I'm trying to focus on what I do understand, and I'm still greatly confused on some of your claims.
I understand the JVM is not only very efficient, but the JIT gives it unique opportunities to optimize where a compiled language couldn't. You may not get those optimizations consistently, but you don't necessarily need to go into that level of minutia.
You also pointed out that these JIT characteristics can be easily gamed against Java in microbenchmarks, so it's not difficult to make Java look slower than it is in a complex application.
That being said, I am not understanding this narrative that low level projects, as they grow, always devolve into an inefficient dynamic soup. The Linux kernel is millions of lines and uses function pointers sparingly and deliberately. SQLite is huge, mature, and almost entirely static. High-frequency trading systems, embedded software, browser rendering engines, database storage layers. There are entire industries of large, long-lived, performance-critical codebases that do not "devolve" into dynamic dispatch.
If you're saying it's just hard to do that, and Java makes it easy to get close enough with its already dynamic model, then fine. But if you're saying this is an inherent problem as low level programs grow, I would like to understand why.
But you also said
> If you think you can save on both RAM and CPU by preferring a low-level language (or Go, which is slower almost across the board), you're just wrong
Really? Ignoring gamed benchmarks, I don't think it's controversial to say Rust consistently beats Java at the same tasks in RAM and CPU. Maybe that's not important to you because they are too small and you're talking about what happens to programs as they grow in complexity. So I'd like to hear more about why you wrote I'm wrong.
> Second, yes good layouts help CPU utilisation, but today you can't get that without giving up on other things that harm performance
Like what? I'm not understanding. You seem to be implying that without boxing we'd be stuck with a lot of dynamic dispatch and fragmented memory, and I'm not seeing the connection.
I brought up unboxing because pointer chasing is expensive and trying to make a collection in Java that you can efficiently loop through can be a frustrating thing.
Does it not box?
> Electron apps are probably very inefficient from some perspectives. But I think this is also overstated by people who are overly sensitive
I also have an m1 laptop and can run things fine. But I'm probably not going to budge on that, because I am consistently exposed to people with low RAM systems, and they are forced to use stuff like Teams in their day to day. Yes, I understand it's cross platform and saves on dev time. Nobody likes using WinForms. But I think Electron has been a net negative on the ecosystem of apps for people with ok computers.
> That being said, I am not understanding this narrative that low level projects, as they grow, always devolve into an inefficient dynamic soup. The Linux kernel is millions of lines and uses function pointers sparingly and deliberately.
Low-level languages are designed for direct and complete control over hardware, and that is also the job of an OS kernel. Their level of abstraction is a perfect match. But the things at which low-level languages are slow - heap allocations and dynamic dispatch - are exactly the things that applications (not kernels) naturally gravitate towards needing over time.
Of course, it's possible to keep redesigning the architecture as the software evolves to avoid low-level languages' slow operations, but that costs a lot. This isn't some new discovery. The motivations for Java's bet on a JIT and moving collectors were a result of seeing what happened with C++: it was very easy to write nice-looking and fast programs. It was very hard and very costly to keep them that way over time.
> SQLite is huge, mature, and almost entirely static
SQLite is not only not huge but is quite small. ~150KLOC.
> I don't think it's controversial to say Rust consistently beats Java at the same tasks in RAM and CPU.
I don't know if it's controversial, but it's certainly very wrong.
Let's look at one of the most famous terrible benchmarks: The Computer Language Benchmarks Game (it's terrible not only because it compares different algorithms, but also because it has no benchmarks that are long-running, none with interesting memory management, and no concurrent benchmarks - the very things most programs today do): https://benchmarksgame-team.pages.debian.net/benchmarksgame/... In all but one, the C++ and Java results are mixed, i.e. some Java entries are faster than some C++ entries and vice-versa, and this is despite the benchmarks penalising JITs and being minuscule, which is where low-level languages shine. This goes to my point about the important difference between "some program can be very fast" and "your program will be fast". Low-level languages and Java are on different sides of the tradeoff here: low-level languages focus on control, which often means "someone could write fast code", while Java focuses on compiler and runtime optimisations of high abstractions with the goal of making your code fast.
If we look at another famous benchmark, techempower, we see the same thing: Java, Rust, and C++ results are intermixed, despite the benchmarks being small and thus favouring low-level languages: https://www.techempower.com/benchmarks/#section=data-r23
Of course, there aren't cross-language application benchmarks, i.e. benchmarks that measure the performance developers really care about. All I can say is that a developer at one of the world's largest tech companies told us that his new team lead wanted to migrate some service from Java to Rust for the performance. What happened was that they experienced a large drop in performance, but to save face, they spent 6-12 months carefully optimising the Rust code, and in the end managed to match, though not exceed, Java's performance.
C++ and Rust are simply not particularly fast for applications, and Java is. It's possible to spend a lot of effort optimising them, but it's effort that needs to be spent continuously as the program evolves. That's exactly what led compilation and memory management experts to design the JVM the way they did in the first place: It's hard to make low-level code efficient for large applications.
> I mean Java and Go are pretty much neck and neck here, with Go using way less RAM
Go uses way less RAM because it uses an inefficient non-moving collector, which is why you see Go shops complaining constantly about the poor performance of Go's GC and why they try to avoid it (as Java developers used to do in the past). The speed is similar only because the benchmarks are not very interesting, but while, broadly speaking, C++, Java, and Rust are roughly at the same "performance level" (ignoring all the tradeoffs I mentioned before), Go is strictly in a lower class. While you have to get pretty large to see Java beating C++ and Rust, it's fairly easy to see Java leaving Go in the dust even on fairly small programs. The programs just need to be a little more interesting than those in the Benchmarks Game.
But I don't think Go is even playing the same game. Its goal wasn't to be a super-optimised language that takes advantage of progress in compilation and memory management technologies. It was meant to be good enough for some things while keeping a small and simple implementation. It's faster than Python and JS, and that's the goal. It's not really trying to compete with C++/Java on performance.
> Like what? I'm not understanding. You seem to be implying that without boxing we'd be stuck with a lot of dynamic dispatch and fragmented memory, and I'm not seeing the connection.
I'm saying that the languages that give you good layout today happen to be languages that are bad at other things (like memory management, dynamic dispatch, and concurrent data structures). So if you win in one area you lose in another (but depending on the program, some of these areas may matter more than others).
> Does it not box?
In Java, values in an int/long/double/etc. array or fields in a class like `class A { int a, b; boolean c; String d; }` are just as boxed as they are in C++, which is to say they're not. Instances of the class will not be flattened into arrays or fields, which is exactly why we have Valhalla, but the problem is not that severe in big programs (which is why we haven't dropped everything to just do Valhalla). Also, remember that boxing has a cost in low-level languages beyond cache locality - due to heap allocations - that doesn't exist (at least not as significantly) in Java. Boxing in Java is much cheaper than it is in C++/Rust, except for the cache-locality cost, but while in some programs that can be a problem, in many it's not the main one.
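A small sketch of the layout point (class names are mine): primitives in arrays and fields are stored flat, references to objects are not, and the reference case is what Valhalla targets:

```java
import java.util.ArrayList;
import java.util.List;

public class BoxingDemo {
    // Primitive fields are stored inline in the object, as in a C struct.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    public static void main(String[] args) {
        // Flat and cache-friendly: a million ints stored contiguously, no boxing.
        int[] flat = new int[1_000_000];

        // Boxed: each element is a separate Integer object on the heap, and the
        // list holds references to them (small values come from a shared cache).
        List<Integer> boxed = new ArrayList<>();
        boxed.add(42); // auto-boxing: int -> Integer

        // A Point[] is an array of references to Point objects, not a flat array
        // of (x, y) pairs -- flattening this is exactly what Valhalla is for.
        Point[] points = { new Point(1, 2), new Point(3, 4) };

        System.out.println(flat.length + " " + boxed.get(0) + " " + points[1].x);
    }
}
```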
> I also have an m1 laptop and can run things fine. But I'm probably not going to budge on that, because I am consistently exposed to people with low RAM systems, and they are forced to use stuff like Teams in their day to day
Of course if you deploy a program that uses a lot of resource X to machines where X is more restricted than the other resources the program uses, you should optimise the consumption of X.
> But I think Electron has been a net negative on the ecosystem of apps for people with ok computers.
That depends on what else these people want to use their computers for while running an Electron app. By far the largest group of people I've seen complain are people here on HN who like counting MBs rather than looking at the overall utilisation picture.
> The Computer Language Benchmarks Game (it's terrible not only because it compares different algorithms, but also because it has no benchmarks that are long-running, none with interesting memory management, and no concurrent benchmarks - the very things most programs today do)
It also compares un-optimised single-thread #8 programs transliterated line-by-line from the same original.
However long (programs run) they never seem to become "long-running".
There's always some programmer who replaces "interesting memory management" with array and int. (Many complaints about Go binary-trees programs seemed to be: they should implement a custom arena.)
What does "no concurrent benchmarks" mean when:
import java.util.concurrent.CyclicBarrier;
> Of course, there aren't cross-language application benchmarks
> However long (programs run) they never seem to become "long-running".
Most application servers are expected to run without issue for at least a day. Our acceptance tests run high workloads for 1, 7, and 30 days. The longest running Benchmarks Game benchmark doesn't break one minute. You can maybe argue whether long running is 3 hours or 3 days, but under one minute isn't long running by anyone's definition.
> What does "no concurrent benchmarks" mean when: import java.util.concurrent.CyclicBarrier;
I believe it's used to coordinate parallelism. Parallelism (where tasks cooperate) and concurrency (where they compete) result in completely different machine workloads.
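The distinction can be sketched in Java (names and sizes are mine, purely illustrative): in the first half, workers cooperate on disjoint slices of one job and meet at a barrier, with no contention during the work; in the second, threads compete for one shared structure, and the contention itself is the workload:

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.LongAdder;

public class ParVsConc {
    // Parallelism: workers cooperate on disjoint slices of one computation.
    // Each writes only its own partial[] slot, so there is no contention.
    static long parallelSum(int n, int workers) throws Exception {
        long[] partial = new long[workers];
        CyclicBarrier barrier = new CyclicBarrier(workers + 1); // workers + caller
        for (int t = 0; t < workers; t++) {
            final int id = t;
            new Thread(() -> {
                for (int i = id; i < n; i += workers) partial[id] += i;
                try { barrier.await(); } catch (Exception e) { throw new RuntimeException(e); }
            }).start();
        }
        barrier.await(); // barrier gives happens-before: partial[] writes are visible
        long sum = 0;
        for (long p : partial) sum += p;
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parallelSum(1_000, 4)); // 0 + 1 + ... + 999 = 499500

        // Concurrency: threads compete for the same shared structure. The
        // machine-level cost here is the contention, which the sliced version
        // above never pays.
        LongAdder shared = new LongAdder();
        Runnable bump = () -> { for (int i = 0; i < 1_000; i++) shared.increment(); };
        Thread a = new Thread(bump), b = new Thread(bump);
        a.start(); b.start(); a.join(); b.join();
        System.out.println(shared.sum()); // 2000
    }
}
```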
It's obviously more interesting than the benchmarks game as it exercises things in a more realistic way, but as much as I like seeing Java winning as it did in this benchmark [1] (even an ancient version of Java, before the new GC generations and new compiler optimisations) it's still very small, and as a batch program, not very representative of most software people write.
The problem with benchmarks is that they tell you how fast a specific program is (the benchmark itself) but it's very hard to generalise from that result to what you're interested in, unless the benchmark is very similar to your program (microbenchmarks never are; larger benchmarks could be, but the space is large so you need to be lucky).
[1]: It's interesting that they made a common mistake when interpreting the results. The program seems to try to get the CPU to 100%. In this situation it's not hard to see that a program that runs even 1% faster and uses 10x more memory is more memory efficient than a program that's 1% slower and uses 10x less memory. That's because while a program runs at 100% CPU, no RAM can be used for any purpose by any other program. So either way you capture 100% of RAM, but in one case you capture it for less time. This idea is at the core of using RAM chips as hardware accelerators (using up CPU effectively uses up RAM because using RAM requires CPU cycles).
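The footnote's arithmetic can be made concrete with a toy calculation (the machine size and runtimes are illustrative numbers of mine, not from the thread):

```java
public class RamSeconds {
    // While a program pins a core at 100%, the machine's RAM is unavailable to
    // other work for the whole run, however much of it the program touches.
    // So the resource actually captured is (machine RAM) x (runtime), not
    // (heap used) x (runtime).
    static double capturedGBSeconds(double machineRamGB, double runtimeSec) {
        return machineRamGB * runtimeSec;
    }

    public static void main(String[] args) {
        double machineRam = 16.0; // GB, illustrative
        // Program A: uses 10 GB of heap but finishes in 99 s.
        // Program B: uses 1 GB of heap but takes 100 s.
        double a = capturedGBSeconds(machineRam, 99);
        double b = capturedGBSeconds(machineRam, 100);
        System.out.println(a < b); // the "bloated" faster program captures
                                   // less RAM-time overall
    }
}
```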
At JavaOne long ago, there would be mixed messages: both "So a benchmark that ends in less than 10 sec probably does not measure anything interesting." and, in blog-post benchmarks, "100000000 hashes in 5.745 secs … 100000000 primes in 1.548 secs".
(Goldilocks would know.)
> … different machine workloads…
I'm happy to accept that you didn't mean no parallel programs.
I didn't say that short-running benchmarks don't measure anything interesting, only that they don't say much about long running programs, where the same mechanisms can exhibit very different behaviour.
Seems like the benchmarks game didn't say that anything interesting about long running programs was measured? And didn't say that "interesting" memory management was measured. And didn't say…
I suppose when you write "because it compares different algorithms" you didn't say that there were no comparisons based on the same algorithm.
We've certainly not attempted to prove that these measurements, of a few tiny programs, are somehow representative of the performance of any real-world applications — not known — and in-any-case Benchmarks are a crock.
The problem with benchmarks isn't that they themselves are lying. Benchmarks always tell the truth - about themselves. The problem is in the conclusions people draw from them. In the nineties benchmarks were still a little extrapolatable because we could say X is slow and Y is fast, as many operations had an intrinsic cost. These days, almost no benchmark (certainly no microbenchmark) is extrapolatable to anything besides itself. Is a branch slow or fast? That depends on what the program did before and what it intends to do later. Is memory access slow or fast? Ditto. Function call? Allocation? They're all so context-dependent now that the only use of benchmarks of some mechanism is for the authors of the mechanism, who know exactly how it works, what exactly is being measured, and what can be extrapolated from that.
If I write a malloc benchmark I may think, oh, this measures the cost of malloc/free. In reality, it only measures the cost for a program whose concurrency, allocation/deallocation patterns, and duration match exactly what I wrote, and bear little resemblance to the numbers I'd get if any of those were different.
So I'm not saying that the Benchmark Game is lying. It is telling the truth about how long those programs ran. It's just that what we can generalise from those benchmarks is even less than what we can from more "interesting" ones, but given that even that is close to nothing anyway, maybe it doesn't matter.
All benchmarks tell the truth about themselves. That has never been what makes benchmarks good or bad. The worst and best benchmarks ever made are both truthful about their results.
But a good benchmark suite is one that covers a variety of different problems and/or programs similar to a significant portion of production software. The Benchmarks Game is neither, plus it's confusing because it often compares things that measure the sophistication of the algorithm while making it seem like it measures something about a language (you don't need to be deceitful to confuse). So no, I don't think it's a good benchmark suite at all.
I know. I don't understand why you think I have a problem with the site's honesty. It's a poor benchmark suite, and it admits it is. We're in agreement.
> Here's something that could reasonably make those claims
I'm not familiar with this paper, but you seem to think I was complaining about false claims, which I wasn't. Benchmarks are problematic these days because results no longer generalise as they did a couple of decades ago, but some benchmarks are of higher quality than others (again, I'm not talking about what they say they are but about what they actually are) by at least covering a wider and possibly more relevant set of use cases, and by offering comparisons that are less confusing.
I don't understand what you're trying to say. I said that the Benchmarks Game is not a good benchmark suite in the sense that it does not measure language speed differences, since 1. it compares different algorithms, and 2. it doesn't cover some of the most important use-cases that languages/runtimes optimise for [1]. That's all. I'm not saying it's deceitful, I'm saying it's just not a good comparison of language speeds. Are you agreeing or disagreeing?
[1]: In particular, Java was designed to overcome some of the biggest performance issues of low-level languages that have plagued a large number of applications: memory management when objects are of varying sizes and lifetimes, concurrency (especially lock-free data structures), and dynamic dispatch, which grows in use as applications grow in size and complexity. Not a single one of these is covered in the Benchmarks Game, which focuses on small, very regular, batch workloads, the very things that low-level languages have always been good at, and none of the areas where the performance of low-level languages has traditionally (and to this day) suffered and which led to different compiler and memory management designs.
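As a small illustration of the lock-free point (a sketch of mine, not a claim about any benchmark): in Java, a concurrent map can be updated and read without locks or unsafe code, because the GC solves the safe-memory-reclamation problem that such structures face in low-level languages:

```java
import java.util.concurrent.ConcurrentHashMap;

public class LockFreeDemo {
    public static void main(String[] args) throws Exception {
        // ConcurrentHashMap reads never block, and writers mostly CAS on
        // individual bins. A reader can keep traversing a node that a writer
        // has just replaced, because the GC keeps it alive as long as anyone
        // holds a reference -- no epochs or hazard pointers, the part that
        // forces unsafe code in C++/Rust versions of the same structure.
        ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

        Runnable writer = () -> {
            for (int i = 0; i < 10_000; i++) hits.merge("key", 1, Integer::sum); // atomic update
        };
        Thread t1 = new Thread(writer), t2 = new Thread(writer);
        t1.start(); t2.start();

        // Reads proceed while the writers run, without blocking them.
        while (t1.isAlive() || t2.isAlive()) hits.getOrDefault("key", 0);

        t1.join(); t2.join();
        System.out.println(hits.get("key")); // 20000: merge() loses no update
    }
}
```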
> it's just not good comparison of language speeds
It's not that the benchmarks game is not a good benchmark suite, it isn't a benchmark suite.
It's not that the benchmarks game is not a good comparison of language speeds, it's that comparison of "language speeds" is so under-specified as-to-be wishful thinking.
> Java was designed to…
"… build software for the next generation of consumer electronics – think smart toasters, interactive TVs, and other futuristic gadgets." Things change.
>… the very things that low-level languages have always been good at…
Which is why there are people who find those kind-of Java programs being in-any-way comparable, somewhat surprising.
> It's not that the benchmarks game is not a good benchmark suite, it isn't a benchmark suite.
OK, but I was responding to someone who did consider it to be a benchmark suite. As long as we agree it's not a good benchmark suite whatever it considers itself to be, we're in agreement.
> It's not that the benchmarks game is not a good comparison of language speeds, it's that comparison of "language speeds" is so under-specified as-to-be wishful thinking.
With that I completely agree. But if you group results by language, that's exactly what you're inviting, and if your suite of benchmarks or whatever you want to call it covered a wider range of problems, that point could be more easily seen. Let's say that the combination of grouping results by language and covering only a very narrow (and niche) set of problems that also happens to be the sweet spot of some languages that have other significant performance failings in other use cases doesn't exactly help people get the right impression.
> Low-level languages are designed for direct and complete control over hardware, and that is also the job of an OS kernel. Their level of abstraction is a perfect match. But the things at which low-level languages are slow - heap allocations and dynamic dispatch - are exactly the things that applications (not kernels) naturally gravitate towards needing over time.
Hmm. To put a pin in this, you're saying the following is harder to do as an application grows in complexity:
- Avoiding lots of little allocations (using arenas, value types)
- Avoiding dynamic dispatch
Two things that Java doesn't need to avoid, because it's optimized for them. What I'm unclear on is the perspective that it's inevitable. I don't know what scale of apps you're talking about.
In the case of Rust, it has arenas, stack-allocated structs, and generics via monomorphization. Not only can you avoid both of these things, it doesn't even seem that difficult. If you're saying the borrow checker just becomes too cumbersome for sufficiently large applications, that's fine.
> What happened was that they experienced a large drop in performance, but to save face, they spent 6-12 months carefully optimising the Rust code, and in the end managed to match, though not exceed, Java's performance.
There's really not enough detail here to draw from it. But they got to the same performance as Java with less RAM, at the cost of dev experience. Does that not support what I said? Maybe that 6-12 months for the same performance was way too much, but for a smaller app, that could actually be a worthy tradeoff, no? Like...a desktop app?
> Let's look at one of the most famous terrible benchmarks: The Computer Language Benchmarks Game (it's terrible not only because it compares different algorithms, but also because it has no benchmarks that are long-running, none with interesting memory management, and no concurrent benchmarks - the very things most programs today do)
If the Benchmarks Game is not long enough, dynamic enough, or allocate-y enough, then it's not worth talking about. But what is worth talking about?
This is difficult because you won't accept benchmarks that are too small because they are unfair to the JIT, but you also won't accept applications that are too small. Apparently SQLite, at 150K LOC, is not big enough to be relevant to this discussion. So all we have is anecdotal experience that Java is more performant for large, long-lived processes with many contributors. I've certainly read a lot of reports from people rewriting to Rust and getting much leaner applications, but maybe they weren't working on apps complex enough to force them into dynamic dispatch or many allocations. Or maybe it was pure cope. I don't know, because their anecdotes and your anecdotes are not very detailed.
But see how far we have drifted from what the original claim was, which is the claim you cannot reduce RAM consumption without harming CPU utilization. You said Rust isn't particularly fast, but Java is. That's a really strong claim considering we have painted a much narrower scope of when that's true. Most of us aren't working on 3M LOC faang apps. We are working on smaller things, or desktop apps. In those cases, the ratio of RAM consumption to CPU efficiency is much better in Rust than it is in Java. Isn't using Java just a straight up RAM loss for those cases?
> That depends on what else these people want to use their computers for while running an Electron app. By far the largest group of people I've seen complain are people here on HN who like counting MBs rather than look at the overall utilisation picture
That doesn't seem terribly fair. Grandma doesn't have the vocabulary to complain about RAM, true. But her computer is slow, she asked her grandson for help, and her grandson told her to use Spotify in the browser, not download the app. And now she has to be mindful of what she has open, even though 8gb of RAM is actually a lot, we've just lost sight of it.
> Of course if you deploy a program that uses a lot of resource X to machines where X is more restricted than the other resources the program uses, you should optimise the consumption of X
The problem is CPU and RAM usage are fundamentally different. I don't get to know what will be run with my program, so I don't get to know how restricted RAM is. If a computer is CPU limited, at the very least, it won't be terribly busy with programs that aren't being used. But for most programs, RAM is allocated and then just sitting there, whether the program is being used or not. So deploying an Electron app kinda feels like a middle finger to your users, because even though it doesn't need to, it limits the amount of programs they can have open. Not to save on CPU, but because Chromium needs that RAM to work in the first place. It's purely a dev experience decision because people don't know how to ship desktop apps. It's pure waste for the user.
> In the case of Rust, it has arenas, stack-allocated structs, and generics via monomorphization. Not only can you avoid both of those things (dynamic dispatch and heap allocation), it doesn't even seem that difficult. If you're saying the borrow checker just becomes too cumbersome for sufficiently large applications, that's fine.
Not really.
First, let's look at stack-allocated structs and think about how much data can live in them. The typical stack size is 2MB, but because the only live data in a stack is in caller frames, we can say that on average the amount of live data a stack holds is about 1MB. Now look at how much RAM an application uses in MB and divide it by the number of threads. Usually the result is much larger than 1MB, which means that data in stacks is not a significant portion of the program's data. (Async changes this calculus a bit, but async is extremely limited in Rust as it doesn't allow recursion, FFI, or dynamic dispatch; proper user-mode threads, like the ones in Java and Go, actually make stack allocation more useful.)
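A back-of-envelope version of that arithmetic, with assumed illustrative numbers (not measurements of any real service):

```rust
fn main() {
    // Assumed example figures, purely for illustration:
    let heap_mb = 2048.0_f64;  // a service using 2GB of RAM
    let threads = 64.0;        // running 64 threads
    let live_stack_mb = 1.0;   // ~half of a typical 2MB stack is live

    let total_stack_mb = threads * live_stack_mb; // 64MB at most
    let stack_share = total_stack_mb / heap_mb * 100.0;
    // Stacks can hold at most a few percent of the program's data here.
    println!("stacks hold at most ~{stack_share}% of the data"); // ~3.125%
}
```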
Now let's look at arenas. Arenas are extremely efficient because they offer a RAM/CPU tradeoff knob similar to that of moving GCs, but they're not as general (if your allocation pattern suits them, they're great, but you can't use them everywhere). In Rust, though, things are much worse, because arenas are quite limited; too many standard-library data structures, including strings, vectors, and maps, can't easily be plugged into an arena. The only language that gives you arenas' full power (which, again, is not completely general) is Zig. This is one of the reasons hardcore low-level programmers find Rust so underwhelming (the other being that too many things that are important in low-level programming, including basic data structures but also benign concurrency, require unsafe).
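A toy bump arena sketch showing the property being discussed (real arenas also handle alignment and hand out typed references; this is only the core idea):

```rust
// A toy bump arena: allocating is an offset increment, and freeing
// everything is a single operation.
struct Arena {
    buf: Vec<u8>,
    used: usize,
}

impl Arena {
    fn with_capacity(cap: usize) -> Self {
        Arena { buf: vec![0; cap], used: 0 }
    }

    // Bump-allocate `n` bytes, or None if the arena is exhausted.
    fn alloc(&mut self, n: usize) -> Option<&mut [u8]> {
        if self.used + n > self.buf.len() {
            return None;
        }
        let start = self.used;
        self.used += n;
        Some(&mut self.buf[start..start + n])
    }

    // Deallocating the whole arena is one bookkeeping update.
    fn reset(&mut self) {
        self.used = 0;
    }
}

fn main() {
    let mut arena = Arena::with_capacity(1 << 16);
    for _ in 0..100 {
        let slot = arena.alloc(64).unwrap(); // no malloc, no free list
        slot[0] = 42;
    }
    assert_eq!(arena.used, 6400);
    arena.reset(); // everything "freed" at once
    assert_eq!(arena.used, 0);
}
```

Note the limitation mentioned above: on stable Rust, std's `String`/`Vec`/`HashMap` always go through the global allocator; the `allocator_api` that would let them target an arena is still unstable.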
> But they got to the same performance as Java with less RAM, at the cost of dev experience.
You say "dev experience" as if it's some quality-of-life thing. They traded off a cheap resource, RAM, for an eternal maintenance and evolution cost that would only grow higher as the program grows. And remember that RAM isn't entirely fungible. It's hard to get less than 1GB per core (either in bare metal or in cloud VMs/containers) so using less RAM often saves you $0.
> but for a smaller app, that could actually be a worthy tradeoff, no? Like...a desktop app?
I said that low-level languages can offer good performance in small programs, but many desktop apps aren't small. Claude Code's CLI is over 500KLOC.
> So all we have is anecdotal experience that Java is more performant for large, long lived processes with many contributors
If anything, benchmarks are much more anecdotal. Not only are there fewer benchmarks than applications, but they don't even resemble real programs. But yeah, ever since operations lost their intrinsic costs some 20 years ago - with CPU cache hierarchies, branch prediction, ILP, more powerful optimising compilers, and more elaborate GCs/memory allocators - the ability to generalise from one program to another is close to nil. So yeah, experience is all we have to go with. Going with the numbers we have (some benchmarks that don't extrapolate) rather than the numbers we need but don't have doesn't help.
I can tell you that the loss of intrinsic operation costs has made our lives as compiler/runtime developers much harder, because we can no longer tell people that this operation is generally fast or generally slow. But that doesn't change the fact that this is our reality.
> what the original claim was, which is the claim you cannot reduce RAM consumption without harming CPU utilization
No. I wrote, and I quote: "Using a lot less RAM often implies using more CPU."
> That's a really strong claim considering we have painted a much narrower scope of when that's true.
Did we? Most of software (measured by the distribution of paid programmers) is in large applications.
> Most of us aren't working on 3M LOC faang apps. We are working on smaller things, or desktop apps
I don't think that's true at all. Forget FAANG. Most software isn't written by software companies at all, but is in-house software (well, Netflix isn't a software company, so I guess it's one FAANG letter). The bulk of software is in things like telecom management and billing, banking and finance, manufacturing control, logistics and shipping, healthcare and hospitality, retail and payment processing, travel, government, defence. 3MLOC is quite typical. People who work on smaller software are overrepresented in Silicon Valley and, I'm guessing, among HN readers, but they're the outliers.
> Grandma doesn't have the vocabulary to complain about RAM, true. But her computer is slow, she asked her grandson for help, and her grandson told her to use Spotify in the browser, not download the app. And now she has to be mindful of what she has open, even though 8gb of RAM is actually a lot, we've just lost sight of it.
Does she, though? I doubt she's running anything intensive in the background, so she's really only using one program at a time, and SSDs are fast enough these days to page in virtual memory when she switches programs, unless the one program she's currently using eats up the 8GB. I agree that if her OS - the one thing she needs to run in the background - is taking up a lot of RAM, that could be a problem, but the OS is special. Her computer is slow not because she's using a program that eats up a lot of RAM, but because she's inadvertently running a lot of stuff in the background that shouldn't be running at all (browser plugins? programs that add themselves as login items?). A Surface Laptop comes with 16GB of RAM. No single program uses even half of that.
> The problem is CPU and RAM usage are fundamentally different. I don't get to know what will be run with my program, so I don't get to know how restricted RAM is.
You'd think that, but that's not the case. I admit that I only recently started thinking deeply about this, thanks to some conversations with a colleague who's one of the world's leading experts on memory management, and it was so eye-opening that I gave a talk about this at the recent JavaOne (because my colleague wasn't available). There are two sides to this:
1. On the demand side, the key is that the use of RAM necessitates the use of CPU (and vice versa): writing to and reading from RAM requires CPU, and we write to RAM only when we expect the program to read it in the future. This means that any CPU we use takes away another program's ability to use some RAM (because using RAM requires CPU). To give the basic intuition for this, I mentioned the extreme example of a program that uses 100% of the CPU. Such a program effectively captures 100% of RAM no matter how much of it it actually uses, because no other program has the CPU available to access it. You don't need to know anything about what other programs do. Another way to think about this is that the machine is spent whenever the first of RAM and CPU is exhausted.
2. On the supply side, RAM and CPU - whether in metal or in virtualised hardware - are effectively sold as a package (it's hard to get less than 1GB per core, except on embedded devices these days). Furthermore, both moving GCs and (to a far lesser extent) memory allocators can trade RAM and CPU (sophisticated memory allocators aren't quick to return RAM to the OS and maintain internal buffers).
So even though it is true that different programs may have different CPU/RAM usage patterns, you have to think about the ratio rather than CPU and RAM in isolation, and try to achieve some approximate balance. To put it simply, if a program uses a lot of CPU it doesn't make sense for it to use little RAM, because by using a lot of CPU it is effectively depriving other programs of their ability to use RAM (as that requires CPU). There are some exceptions, such as large caches, but the tradeoffs there are very different and too complicated to go into here (I did cover that in my talk).
> It's purely a dev experience decision because people don't know how to ship desktop apps. It's pure waste for the user.
No. I mean, some of it is probably waste, but:
1. What you call "dev experience" also affects the user because it directly impacts the cost of software. Users want cheaper software.
2. More relevant to this particular discussion is what else the user could do. Having "more programs open" isn't a problem thanks to SSDs and virtual memory. So we're talking about programs that are actively using the CPU for something, and they, too, need a balance of the RAM/CPU ratio.
I'm not trying to be dogmatic in the other direction and assert that Electron is necessarily the best tradeoff. But I'm saying that efficiency is ultimately about money that is spent on a combination of RAM, CPU, and software, and when you look at the full picture you see that it's more complicated than it seems. It's not that the software industry has decided to waste users' money. If it did, there would be a competitive edge to programs that use less RAM, but we don't see that competitive edge. What we do see is a few people on HN saying how they simply can't live with VS Code's 50ms keystroke latency and how amazing is some other editor with only 20ms latency that's likely to go out of business soon [1]. The people who made these decisions aren't some early-career developers who just like hot code reloading or some such.
[1] Yes, I do think Rust is more hardware-efficient than JS, but here I'm looking at an even bigger picture. And yes, if you rewrite from JS to C++/Java/Rust/Go you can win on hardware, but as I said at the very beginning, any such rewrite is not really "an optimisation".
> To put it simply, if a program uses a lot of CPU it doesn't make sense for it to use little RAM
The fallacy in that reasoning is that a program that's using a lot of CPU (especially if it's a huge MLOC-sized app) is most likely using up its CPU on memory throughput, not pure number-crunching compute! So at least for the enterprise-app case (not pure number crunching), you'd actually need a tunable tradeoff between memory throughput and total RAM footprint, and adopting copying/moving GCs just doesn't give you that. Collection cycles are a huge burden on memory throughput: thus, indirectly, on the very thing you're calling "CPU". The theoretical prospect of winning by forgoing collection cycles outright (pure bump arena allocation) is explicitly excluded here, since we're talking about long-running programs that will at some point need to garbage collect.
Heap allocation may have marginally higher "CPU" use in the pure compute sense, but that's exactly the kind of CPU use that does trade off successfully with a lower RAM footprint.
Similarly, non-moving concurrent garbage collectors like Go's also successfully navigate this tradeoff compared to moving/copying collectors: their collection work is compute-intensive and somewhat memory-traffic-intensive (though less so than if memory were being copied or moved), but it can be largely shunted off to a lower-priority background thread, leaving only minor compute overhead on the hot path.
On the other side of the tradeoff, arenas and caches increase memory footprint in a way that's low-impact on memory throughput (unlike pervasive use of a copying/moving GC) because only live data is accessed as needed, and deallocating the arena is a single operation. The tradeoff is actually highly favorable to low-level languages, which commonly use arenas to manage challenges with heap allocation such as fragmentation.
> The fallacy in that reasoning is that a program that's using a lot of CPU (especially if it's a huge MLOC-sized app) is most likely using up its CPU on memory throughput, not pure number-crunching compute!
No, there's no such fallacy here because that assumption is not needed for the conclusion. The point is that CPU is needed to use RAM, and so if you use CPU for whatever reason - even to loop for an hour over some integers - you are consuming a resource that is needed to use RAM (by another program). So the use of CPU "captures" RAM whether it uses RAM or not, so it might as well use it.
The extreme example I gave was that a program that uses 100% CPU (again, even if it uses zero RAM) effectively captures 100% of RAM, because no other program can use any RAM while that program is running. This extreme example is just to build some intuition, but it scales to lower CPU utilisations.
> Collection cycles are a huge burden on memory throughput
I don't even know where to start. The whole point of moving collectors is that they can make the cost of memory management arbitrarily low, reducing the overhead compared to free-list approaches. A collection cycle does a constant amount of work (per program and workload), but the frequency of collections can be made arbitrarily low. This is memory management 101.
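The amortization argument can be sketched with the textbook model (illustrative numbers of my own, not JVM measurements): each collection copies roughly the live set, and a collection is needed once per (heap - live) units of allocation, so growing the heap shrinks GC work:

```rust
fn main() {
    // Illustrative model, not a measurement: each collection copies the
    // live set, and a collection is needed once every (heap - live)
    // MB of allocation.
    let live_mb = 256.0_f64;
    let alloc_rate_mb_per_s = 1024.0;

    for heap_mb in [512.0_f64, 1024.0, 4096.0] {
        let collections_per_s = alloc_rate_mb_per_s / (heap_mb - live_mb);
        let copied_mb_per_s = collections_per_s * live_mb; // GC work per second
        println!("heap {heap_mb:>6} MB -> GC copies ~{copied_mb_per_s:.0} MB/s");
    }
    // More heap headroom => arbitrarily less GC work: the RAM/CPU knob.
}
```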
The problem of moving collectors has traditionally been the impact on latency, not on throughput (they were always better on throughput than free lists) - until the advent of pauseless moving collectors.
> Heap allocation may have marginally higher "CPU" use in the pure compute sense, but that's exactly the kind of CPU use that does trade off successfully with a lower RAM footprint.
Except it doesn't given the actual economics of RAM and CPU. You'll need to wait for my talk to be posted to YouTube (I can't reproduce it all here), but in the meantime you can watch this one, by my colleague, which was a keynote at the most recent ISMM (International Symposium on Memory Management): https://youtu.be/mLNFVNXbw7I
The problem is that Erik is one of the world's leading experts on memory management, and he's talking to other experts, so his talk assumes quite a bit of familiarity with the subject. Also, his comparison focuses on tracing GCs, leaving the comparison to malloc/free implicit.
> Similarly, non-moving concurrent garbage collectors like Go's also successfully navigate this tradeoff compared to moving/copying collectors
Again, except they do not. Go users experience severe problems with the GC that Java users no longer do, precisely because of the inefficiencies of their simple GC (the JDK used to have such a GC, but we removed it five years ago when newer, more sophisticated algorithms yielded better results).
> On the other side of the tradeoff, arenas and caches increase memory footprint in a way that's low-impact on memory throughput (unlike pervasive use of a copying/moving GC)
This is simply not true, and shows unfamiliarity with how modern moving collectors actually work (remember that the first open-source high-throughput, pauseless moving collector was released only two and a half years ago). Moving collectors offer pretty much the same tradeoff as arenas. The key points are:
1. A generational design makes copying a relatively rare operation to begin with (only a small number of objects are ever copied).
2. The frequency of collections can be made arbitrarily low.
If the usage happens to be arena-like, i.e. no objects survive, nothing is copied (unrelated long-lived objects are already in the old gen, and because the old gen is untouched, there's no need to compact anything there).
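A toy model of that generational point (illustrative numbers only): the cost of a young-generation collection scales with survivors, not with how much was allocated:

```rust
fn main() {
    // Illustrative model only: a young-generation collection copies the
    // survivors, so its cost scales with the survival rate.
    let young_gen_mb = 512.0_f64;
    for survival_rate in [0.0_f64, 0.05, 0.5] {
        let copied_mb = young_gen_mb * survival_rate;
        println!("{}% survive -> {copied_mb} MB copied per young collection",
                 survival_rate * 100.0);
    }
    // Arena-like usage (nothing survives) makes a young collection ~free.
}
```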
The reasons for not using moving collectors have nothing to do with throughput:
1. Latency used to suffer. Low-latency moving collectors were a very advanced technology. The first open-source one is younger than ChatGPT.
2. Moving collectors impact the design of FFI (with C, etc.), as C (etc.) does not support moving pointers, for reasons having nothing to do with performance. Languages that want a very direct and simple FFI (as FFI calls are very common in such code) have a hard time implementing efficient moving collectors.
3. Good moving collectors require large expert teams (let alone pauseless moving collectors). The languages that have them (to varying degrees of sophistication and performance) are well-funded ones; in particular, the JDK team, the .NET team, and the V8 team. Good allocators (for malloc/free) are also big and sophisticated beasts these days, but they're much easier to reuse in different languages (i.e. not only by C/C++/Zig/Rust, but also by Python). Effectively, large pieces of those languages' runtimes are "offshored" to unrelated specialist teams.
> The tradeoff is actually highly favorable to low-level languages, which commonly use arenas to manage challenges with heap allocation such as fragmentation.
This is also not true (and I say this because I'm primarily a low-level programmer, and have been doing low-level programming for over 25 years). First, C++ and Rust in particular make arenas hard to use to their full power (which is one of the several reasons low-level programmers prefer Zig). Second, if you're not familiar with the severe costs of memory management in low level languages, I can only conclude you haven't been doing it for very long.
Java was designed, among other things, to reduce the severe and hard-to-fix performance problems that many C++ programs had experienced (and still do to this day). The hard problems concern both the limitations of AOT compilation and of free-list-based memory management. I'm not saying all C++ programs suffer from these issues, but a huge class of them do, which is one of the reasons large programs have migrated to Java.
I feel like this has been a great discussion but all the good technical talk is tapering off. There's more rhetorical semantics now than anything. But I appreciate you teaching me. I'm a bit tired in this reply.
I wish I could find you a few reports from people on here basically renouncing Java because they could not optimize it any further after 20 years programming in it, and moving to Rust. I'd be curious what you'd think.
> You say "dev experience" as if it's some quality-of-life thing. They traded off a cheap resource, RAM, for an eternal maintenance and evolution cost that would only grow higher as the program grows
No, I say dev experience because I brought it up earlier, and staked my claim on it. Because it's an umbrella term that covers nice-to-haves and how ergonomic the language and ecosystem are. The antithesis would be lots of repetitive plumbing that slows down feature release. It's one part of the triangle. They are now shipping way behind but wound up with a product that is just as fast and uses less RAM. That kind of decision can matter to other projects. Smaller projects likely wouldn't have such a slow turnaround.
Again, we're not advocating for everyone to do rewrites. Threads like this are people begging app developers to stop using stuff not appropriate for desktop apps.
> I said that low-level languages can offer good performance in small programs, but many desktop apps aren't small. Claude Code's CLI is over 500KLOC.
Err, Claude Code is very special, yes. Couldn't tell you why that tool needs to use that much, but most desktop apps don't. They are built on vendor code and keep the actual app code small, and that makes them excellent candidates for what we're talking about.
> 3MLOC is quite typical. People who work on smaller software are overrepresented in Silicon Valley and, I'm guessing, among HN readers, but they're the outliers
I don't agree, unless you have some stats I don't know about. I mean I'm really jealous that you even know someone that worked on an app of that size. Most people are coding for one of the millions of mid-sized businesses dotted all over the country. They are on 20-year-old code bases that make great revenue. Everyone there is nice and meets with you weekly. The devs answer to the clients directly. They don't really need to grow their business endlessly, but there's lots of maintenance to do. I've worked with healthcare companies; they were certainly not 3M lines of code. What you're describing is an extremely narrow class of software that most people will never touch. But I think it sounds cool.
> If anything, benchmarks are much more anecdotal
Not really. I concede they only measure what they do, and it might not be much, but at least they measure something, and it's public and reproducible. Anecdotes are vague stories that are impossible to evaluate. I had an anecdote of someone saying they can't use Java anymore because, even with 20 years of experience, they cannot optimize Java any further for what they need. They rewrote it in Rust. It works much better now, and it's not even close. What am I to make of that?
> Her computer is slow not because she's using a program that eats up a lot of RAM, but because she's inadvertently running a lot of stuff in the background that shouldn't be running at all
There are absolutely apps that use 8GB of RAM. And page swaps are not good, even with an SSD. They're a real problem.
I just find it lame to tell people this when we could ship leaner apps, and it wouldn't even be that hard. It's 2026 and people should be able to have whatever windows they want open. Even 8GB of RAM is a lot; we've just forgotten about it. Shit, my web browser uses 4GB of RAM idle.
> So even though it is true that different programs may have different CPU/RAM usage patterns, you have to think about the ratio rather than CPU and RAM in isolation, and try to achieve some approximate balance. To put it simply, if a program uses a lot of CPU it doesn't make sense for it to use little RAM, because by using a lot of CPU it is effectively depriving other programs of their ability to use RAM (as that requires CPU). There are some exceptions, such as large caches, but the tradeoffs there are very different and too complicated to go into here (I did cover that in my talk).
That's a cool realization. But I think it's slippery. Only in extreme scenarios will your CPU actually block RAM. The 100% usage scenario makes sense. But most of the time, your CPU is going to be underutilized and capable of letting every app use RAM freely. Obviously the more direct problem would be that someone's RAM was sucked up by different apps.
> It's not that the software industry has decided to waste users' money. If it did, there would be a competitive edge to programs that use less RAM, but we don't see that competitive edge. What we do see is a few people on HN saying how they simply can't live with VS Code's 50ms keystroke latency and how amazing is some other editor with only 20ms latency that's likely to go out of business soon
How? If you're forced to use work software, you have no competition to go to. Same for your music app, your social network, your team's chat tool. The "competitive edge" argument requires users to actually have a choice, and for most desktop software they don't: they use what their employer, school, or social network has standardized on. Where users do have free choice, they gravitate toward leaner options constantly. Sublime kept paying customers against free Electron alternatives. Mobile platforms enforce resource discipline and have no Electron equivalent. The competitive edge for leanness exists; it just can't express itself when ecosystem effects lock users in.
So I still feel strongly there are cases where you could make sufficiently small programs in Rust that wouldn't devolve into spaghetti. That would give you great performance and ram usage. I still want to find the example of my anecdote, but I'm tired.
> You say "dev experience" as if it's some quality-of-life thing. They traded off a cheap resource, RAM, for an eternal maintenance and evolution cost that would only grow higher as the program grows
> No, I say dev experience because I brought it up earlier, and staked my claim on it. Because it's an umbrella term that covers nice-to-haves and how ergonomic the language and ecosystem are. The antithesis would be lots of repetitive plumbing that slows down feature release. It's one part of the triangle. They are now shipping way behind but wound up with a product that is just as fast and uses less RAM.
... and will cost much more to evolve, costs will never drop and may well rise. These costs are higher than the RAM they saved, which was free anyway, because it couldn't be used for anything else.
> That kind of decision can matter to other projects. Smaller projects likely wouldn't have such a slow turnaround.
> Again, we're not advocating for everyone to do rewrites. Threads like this are people begging app developers to stop using stuff not appropriate for desktop apps.
The people begging are sometimes right and sometimes really wrong. They are sensitive to certain things but don't consider the full picture.
> I said that low-level languages can offer good performance in small programs, but many desktop apps aren't small. Claude Code's CLI is over 500KLOC.
> Couldn't tell you why that tool needs to use that much, but most desktop apps don't. They are built on vendor code and keep the actual app code small, and that makes them excellent candidates for what we're talking about.
VS Code is something like 5MLOC. Slack is probably around a million. Every desktop app (that isn't bundled with the OS) that I or people I know use is roughly that size or bigger.
> What you're describing is an extremely narrow class of software that most people will never touch. But I think it sounds cool.
First of all, this is the class of software that most people rely on the most by far. It certainly contributes more economic value than other software. You use that software every time you tap your bank card; every time you place or receive a call or send or receive a text message; every time a package arrives at your doorstep; every time you watch any video on any platform; every time you receive medical treatment or stay at a hotel. Clearly, it's the class of software that contributes the majority of value that software delivers. I don't know exactly how many people work on such software. I think around 50% of developers at least, but even if I'm wrong, we're talking at least a few million developers.
> but at least they measure something, and it's public and reproducible.
There is zero value to a measurement that is not relevant to you, and negative value to making you think it's relevant ("well, it's not the number I need but it's some number I have so I'll go by that") when it's not.
> Anecdotes are vague stories that are impossible to evaluate.
You can evaluate them at least as well as you can benchmarks - by asking questions that will allow you to know if the information is relevant to your use case - only they at least have a chance of being relevant, while benchmarks really rarely are. I will just say that if you don't know how exactly a memory allocator is implemented (e.g. whether and how it degrades with time), it is absolutely impossible for you to evaluate a benchmark that purports to measure memory management. There is nothing you can learn from it because you don't really know what it is that's been measured.
> I had an anecdote of someone saying they can't use Java anymore because, even with 20 years of experience, they cannot optimize Java any further for what they need. They rewrote it in Rust. It works much better now, and it's not even close. What am I to make of that?
You are to make of it that, in the past 20 years at least, we have no ability to extrapolate from one program to another. Given that languages like C++, Rust, Java, Zig, and C# are all at the topmost performance category, it makes a lot of sense that in some situations X will be faster than Y and in others Y will be faster than X.
> There are absolutely apps that use 8GB of RAM. And page swaps are not good, even with an SSD. They're a real problem.
I've not seen that in quite a few years now. Right now I have Chrome open on four streaming platforms. Just over 500MB. I have over 100 tabs open in Safari. About 1.2GB. I'm sure there are some apps that use 8GB of RAM, but these aren't apps that grandma uses or wants to use.
> That's a cool realization. But I think it's slippery. Only in extreme scenarios will your CPU actually block RAM. The 100% usage scenario makes sense.
No. This scales. Again, every CPU cycle you consume takes a cycle away from some other program and reduces its ability to use RAM (as that requires the cycle).
> But most of the time, your CPU is going to be underutilized and capable of letting every app use RAM freely.
No. Using RAM means using CPU. If you're idle, then you're not using RAM; it might as well be paged to SSD. You get zero points for keeping free RAM plentiful. You save $0.
> How? If you're forced to use work software, you have no competition to go to. Same for your music app, your social network, your team's chat tool. The "competitive edge" argument requires users to actually have a choice, and for most desktop software they don't: they use what their employer, school, or social network has standardized on. Where users do have free choice, they gravitate toward leaner options constantly. Sublime kept paying customers against free Electron alternatives. Mobile platforms enforce resource discipline and have no Electron equivalent. The competitive edge for leanness exists; it just can't express itself when ecosystem effects lock users in.
The competitive edge does not require users to have a choice. It requires a real edge. If some software really makes you more productive, then your employer would be foolish not to buy it. If some software allows the school to significantly save on hardware - the same. The reason these products don't catch on is because they're written by hackers with certain sensitivities who do not understand the economics of their users' hardware and software.
> So I still feel strongly there are cases where you could make sufficiently small programs in Rust that wouldn't devolve into spaghetti. That would give you great performance and ram usage.
Sure. Like I said, low-level programs are fast and efficient for small programs. But even then, you need to look at the full picture to know how much, if any, money you're saving.
Thank you for walking me through the CPU cycle hogging RAM thing, that makes more sense. Obviously I have a lot to learn here. For the record, I don't have a problem with Java. And in rereading the whole conversation, I can see you stipulated for exactly what I'm arguing about early on
> The question isn't why apps use a lot of RAM, but what the effects of reducing it are. Reducing memory consumption by a little can be cheap, but if you want to do it by a lot, development and maintenance costs rise and/or CPU costs rise, and both are more expensive than RAM, even at inflated prices
That sums up what I've been getting at much better.
> and will cost much more to evolve, costs will never drop and may well rise. These costs are higher than the RAM they saved, which was free anyway, because it couldn't be used for anything else
I guess I'm suggesting: if it evolves. My experience with smaller apps is spread across dozens of businesses that are 20+ years old. 1M LOC is not a required thing. In fact, for many businesses, you probably couldn't reach that many LOC without inventing busy work for devs. Sometimes my job is ripping features OUT that a different dev company put in, and nobody knows why anymore.
> The people begging are sometimes right and sometimes really wrong. They are sensitive to certain things but don't consider the full picture
That's a true statement in isolation. I think it's reasonable to not want a web browser embedded in multiple apps on your computer. Slack and Spotify use more RAM than Steam. For what each app does, that seems absurd to me. Again, that's not a bad tradeoff from a development velocity perspective.
> First of all, this is the class of software that most people rely on the most by far. It certainly contributes more economic value than other software.
But the fact that this type of software has more devs is different from saying the average project has the same considerations. I wouldn't tell someone to use Kubernetes because FAANG uses it, and that means a lot of devs use it. If you estimate 50% of developers to be working on this kind of software, I estimate 5% of them have any choice over what tech stack they are using in the first place. So when you are making tech stack recommendations and saying "C++ is not fast for apps", you are talking to the other 50%.
> There is nothing you can learn from it because you don't really know what it is that's been measured
That's true. I downloaded the benchmarks and ran them myself and played around with them. But I lean on others for technical evaluation. My understanding of low level programming ends at toy projects and what I've read about cache/cpu. I imagine if you develop the JVM it's frustrating to continuously talk to people about isolated benchmarks.
Edited out a lot of rhetorical arguing after rereading the conversation
> If it evolves. My experience with smaller apps is spread across dozens of businesses that are 20+ years old. 1M LOC is not a required thing.
Just to be clear, the cost of maintaining a program in a low-level language is always higher. That's easily the #1 reason the use of low-level languages has been declining steadily for a few decades now with no hint of a change in direction. What happens in large programs is that low-level languages become slow. So yes, if that program doesn't grow, it will probably not become slow, but they've already paid more on development and continue to pay more on maintenance than any savings they could have made on memory, which are probably zero or nearly that.
The point of my explanation about the RAM/CPU relationship is that a well-balanced ratio is free. If your CPU usage amounts to some X% of RAM "captured", any memory savings below that translate to $0 in savings. It's sort of like ink and paper: they're used in combination, so reducing the consumption of one without the other doesn't really save you anything.
> I think it's reasonable to not want a web browser embedded in multiple apps on your computer.
I don't know why that would be reasonable unless you can show me it's a waste of money. Maybe it is, but I'm not sure.
> Slack and Spotify use more RAM than Steam. For what each app does, that seems absurd to me.
But software is written to deliver value to users. Most software has no intrinsic value. Sometimes in an economy you get things that may seem absurd - I can't think of a good example, but say that you can only buy rope in units of 1m - but make sense once you consider the entire system. Could Slack use much less RAM than Steam? Of course! Should it, though? I don't know.
> Again, that's not a bad tradeoff from a development velocity perspective.
And again, what you call "development velocity" is not some vanity metric, but something that can translate to actual money savings for the user more than reducing RAM consumption.
> Do you disagree that making the equivalent app in Avalonia, JavaFX, Qt would likely use less RAM and CPU than Electron? Is there not room to trim RAM usage in the Desktop world without harming the CPU?
There probably is, but as I said in the beginning, switching a language is a large investment, not exactly an optimisation (and it might not be worth it).
> But the fact that this type of software has more devs is different from saying the average project has the same considerations.
Well, that depends what you mean by "the average project". If we're counting by number of programs/repos, the median project size may well be a 100 line script. We have to weigh it by something. Number of devs and lines of code are probably highly correlated, so either one would do.
> I estimate 5% of them have any choice over what tech stack they are using in the first place
I don't understand the point you're trying to make. I don't really care what someone working on some small website does because getting that tech stack wrong is of little consequence anyway. For software that "matters", the choice of tech also matters, and you're right that the junior developers (and probably many senior developers) working on those projects don't choose the tech, but somebody does, and these are the choices that matter. For example, you care about Slack's tech choice. That was also some high-level decision. If they got it wrong, it wasn't their junior programmers who made the mistake.
> I imagine if you develop the JVM it's frustrating to continuously talk to people about isolated benchmarks.
Yes, but everyone who deals with software performance has been frustrated by this for a long time. Benchmarks used to be at least somewhat more informative until the late '90s. I don't know how to educate developers more about this, but I hope someone manages to do it.
> Well that is a very different statement from what you said earlier, which is "C++ and Rust are simply not particularly fast for applications, and Java is." You have been painting a picture that it is essentially impossible to top Java with Rust except in the narrowest of situations.
It is generally hard to beat Java in large programs. It is always theoretically possible because you can view every Java program as a C++ program (which is what the HotSpot JVM is) running on some data, but it's hard, and I would say close to impossible for similar costs.
> And again, what you call "development velocity" is not some vanity metric, but something that can translate to actual money savings for the user more than reducing RAM consumption
I agree with this entirely.
> I don't really care what someone working on some small website does because getting that tech stack wrong is of little consequence anyway.
For server applications and not tools, probably not. Should Curl have a slower startup time? Probably not.
> but they've already paid more on development and continue to pay more on maintenance than any savings they could have made on memory, which are probably zero or nearly that
One key detail of desktop apps is that every performance compromise is multiplied across all your users. It doesn't make much business sense to spend hundreds of dev hours to spare your server 4GB of RAM. But it has a much bigger impact across a growing number of users.
In the server case, you are offloading Dev velocity to your own RAM cost. In the Desktop app case, you are offloading Dev velocity to everyone else's ability to run programs on their own computer.
> I don't know why that would be reasonable unless you can show me it's a waste of money. Maybe it is, but I'm not sure
/rant
Well it's not reasonable as a singular business decision. If we look at everything from the lens of how you can make the most money as a company, Electron is probably the route to go right now. You can reach more users even if you upset more as you grow. For software that is targeting the things I want, I care about the health of the company because it determines how fast improvements get to me. So, in isolation, it doesn't upset me if an app uses Electron. I may not have ever had a chance to use the app if they chose something else.
The problem is when everyone does that. If the expectation becomes "everyone has a lot of RAM so just use Electron", then where does that end? Now we need more RAM, even as it gets more expensive, to run apps that are not particularly novel. It's not surprising people are growing frustrated. Dev Velocity is not a vanity metric, but it's not inherently moralistic either. Sure, a company could "scale" faster replacing all of their support staff with an AI chatbot, but that doesn't mean I have to like it.
I have much more sympathy for the small dev team than a large corporation when it comes to using Electron, specifically because their software is smaller. If I vet and use their product, it likely has less features overall, but does more of what I want. I'm more forgiving of their compromises and I can't be as picky because I chose this product.
In the case of apps like Slack, I don't get a choice to use it. I use it for work, and they develop a lot of stuff we simply don't use. I literally just need it to send text. And so I am a bit less sympathetic to their decision to offload dev velocity costs onto my computer when I don't particularly want to use their app in the first place, and don't believe the majority of those dev hours will be used to help me.
In the case of VSCode, from what I understand, they have to spend a lot of dev time anyway to make Electron work for them. I don't think the value add is as cut and dry when your product needs to run fast.
> Should Curl have a slower startup time? Probably not.
Of course, but I don't think anyone would consider curl to be of little consequence. There are many small programs that are very important, but in general more value is in larger programs, and I don't think it's hard to see that. A large program costs tens of millions of dollars per year. Companies don't pay that unless the software more than pays for itself. When it comes to small programs, because they're small, competition is also easier. Five different people may identify the same small problem to solve with five different programs. One may end up being consequential, and the rest won't be.
> One key detail of desktop apps is every performance compromise is multiplied across all your users. It doesn't make much business sense to spend hundreds of dev hours to spare your server 4gb of RAM. But it has a much bigger impact across a growing number of users.
Yes, and I don't want to appear as if I claim that, say, Electron isn't a problem. It's just that I'm not sure it's a problem, and I'm trying to say that things are more complicated. If there is a problem, of course it affects many people, but I'm not sure there actually is one. My point about RAM isn't that it's no big deal if you waste it, but that using a lot of it might not be waste at all (in other words, that the RAM you're using is effectively free, as it cannot be used for anything else). So if an Electron program uses 6GB of RAM, and 4 of them are a waste - even if 2 of them are a waste - that's a problem. But even if you can write such a program that only uses 1GB, that doesn't mean that the other 5 are a waste at all. Using less RAM isn't necessarily more efficient if the RAM you saved can't be put to good use.
I'm also making a separate claim that even if some of that RAM is a waste, it could be offset by a lower cost of development, but these are two different claims.
In short, what I'm saying is that it's complicated.
> The problem is when everyone does that. If the expectation becomes "everyone has a lot of RAM so just use Electron", then where does that end? Now we need more RAM, even as it gets more expensive, to run apps that are not particularly novel.
It's not so simple! First, we need more RAM because we have more compute. To some degree it's like ink and paper. You can't enjoy more ink unless you also have more paper. Second, because some RAM can be converted to CPU (through moving collectors or arenas) the overall cost of running some computation can be lower if you buy more RAM. Third, once you already have that RAM, how much of a problem is it if some silly program uses a lot of it? The Electron apps I've seen have little problem being paged out to SSD, and they page in fast (paging in even 5GB takes about 2s, and you usually don't need to page in so much at once).
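To make the RAM-to-CPU conversion concrete, here's a minimal Java sketch (the names and the function are mine, purely for illustration): a cache deliberately spends memory to eliminate recomputation, which is the same kind of trade a moving collector or an arena makes at the allocator level.

```java
import java.util.HashMap;
import java.util.Map;

// A cache converts RAM into saved CPU: we hold O(n) extra entries in memory
// so each value is computed exactly once instead of exponentially many times.
public class RamForCpu {
    private static final Map<Integer, Long> cache = new HashMap<>();

    static long fib(int n) {
        if (n < 2) return n;
        Long hit = cache.get(n);           // RAM spent here...
        if (hit != null) return hit;       // ...is CPU saved here
        long v = fib(n - 1) + fib(n - 2);
        cache.put(n, v);
        return v;
    }

    public static void main(String[] args) {
        System.out.println(fib(50)); // prints 12586269025; instantaneous with the
                                     // cache, hopeless without it
    }
}
```

Strip the cache out and the same computation takes exponential time: the memory isn't waste, it's the thing making the CPU budget fit.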
> It's not surprising people are growing frustrated. Dev Velocity is not a vanity metric, but it's not inherently moralistic either. Sure, a company could "scale" faster replacing all of their support staff with an AI chatbot, but that doesn't mean I have to like it.
Who is growing frustrated? Hackers on HN? If there's actual demand, and if the economics really support the claim that it could and should be done, then alternative products will have a competitive advantage. I'm always dubious when people make claims that seem to me to run counter to how the market behaves. That doesn't necessarily mean they're wrong, but it is a significant point against the claim. If a lot of people think they're paying too much for what they're getting and it's possible to pay less, such a product offering would be a huge success.
> I use it for Work, and they develop a lot of stuff we simply don't use.
Serious question: What would you be using the RAM Slack consumes for while at work?
If you could use that RAM for something more productive or if it meant your work machine could be significantly cheaper, then that's a very good argument. But if it's just about not liking to see a large number of something your boss has already paid for when a smaller number could do, even though they don't really make smaller hardware, then that could explain why there isn't a real pressure to do things differently.
> I literally just need it to send text.
Let's say the job could be done in 500KB and that Slack uses 5GB. But you already paid for 8 or 16 GB of RAM. Unless you could use that 5GB for something better while you're sending the text, why do you care that the number goes up? It doesn't cost you anything.
> In the case of VSCode, from what I understand, they have to spend a lot of dev time anyway to make Electron work for them. I don't think the value add is as cut and dry when your product needs to run fast.
I have no idea why VSCode chose Electron and whether it's a good or bad decision (I don't know how much it played a role, but I think that the ability to write plugins in JS/TS helps them, as there are so many JS/TS developers), but it doesn't bother me because the performance is good enough and it doesn't seem to hinder my use of my machine for anything else I run. If at some point it starts bothering me, I'll look for leaner alternatives.
> Other than C, Rust, Go, Swift? C# can use value types, Java cannot. So famously that Project Valhalla has been highly anticipated for a long time. Obviously the JVM team thinks this is a gap and want to address it. That is enough in itself to make someone consider a different language.
As someone working on the JVM, I can tell you we're very much interested in Valhalla and largely for cache-friendliness reasons, but Java certainly doesn't box every value today, and you are severely overstating the case. If you think you can save on both RAM and CPU by preferring a low-level language (or Go, which is slower almost across the board), you're just wrong. But I want to focus on the more important general point you made first.
> My feeling, and the feeling of most people, is that dev experience has been so heavily prioritized that we now have abstractions upon abstractions upon abstractions, and software that did the same thing 20 years ago was somehow leaner than the software we have today. The narrow claim "within a fixed design, reducing RAM often costs CPU" is true.
The problem here is that in some situations there's truth to what you're saying, but in others, it is just seriously wrong. I think the misconception comes precisely because "most people" these days don't have the long experience with low-level programming that people in my generation of developers do, and so you're not aware that many of these abstractions are performance optimisations that come from deep familiarity with the performance issues of low-level programming (I started out programming in C and x86 Assembly, and in the first long job of my career I worked on hard- and soft-realtime radar and air traffic control systems in C++).
Low-level languages aren't meant to be fast (and aren't particularly fast). They're meant to give you direct control over the use of hardware. When it comes to small software, this control does frequently translate to very good performance, but as programs get larger, it makes low-level languages slow. It is true that Java was intended to help developer productivity, but it's also meant to solve some of the intrinsic performance issues in low-level languages, which it does rather well. After all, our team has been made up of some of the world's biggest experts in optimising compilers and memory management, and removing some of C++'s overheads is very much a central goal.
So where do things go wrong for low-level languages? The core problem is that these languages split constructs into fast and slow variants, e.g. static vs dynamic dispatch and stack vs heap allocation. The programmer needs to choose between them. What happens is:
1. As programs grow larger and more complex, the movement is almost completely monotonic in the direction of the more expensive, and more general, variants.
2. There is a big difference between "a fast program could hypothetically be written" and "your program will be fast". Getting good performance out of low-level languages requires not only experience, but a lot of effort. For example, you can write a small benchmark and see that malloc/free are pretty fast these days, but that's often true only for the benchmark, where objects tend to be of the same size, and their allocation and deallocation patterns are regular. Memory allocators degrade over time, and they're quite bad when patterns are irregular, which is what happens in real programs, especially large ones. There's also the question of meticulous care around correctness. When Rust first came out I was very excited to see a few important correctness issues solved without loss of control, but was then severely disappointed. Almost anything that is interesting from a performance perspective for us low-level programmers requires unsafe. Even a good hashmap requires unsafe. The performance cost of safety in Rust is higher than it is in Java, and non-experts end up writing slower programs (when they're not small, at least).
Such performance issues have plagued low-level programming forever, and Java is reducing these overheads. The idea that high abstractions can improve performance was possibly first stated in Andrew Appel's paper, "Garbage Collection Can Be Faster than Stack Allocation" in the eighties, in which he wrote: "It is easy to believe that one must pay a price in efficiency for this ease in programming... But this is simply not true."
Instead of a static/dynamic dispatch split, Java offers only the general construct (dynamic), and the compiler can "see through" dynamic dispatch and inline it better than any low-level compiler ever could. You can say that surely there has to be some tradeoff, and there is, but not to peak performance. The tradeoff is that 1. you lose control and can't guarantee that the optimisation will be made, so you get good average performance but maybe not the best worst-case performance (which is why it's not hard to beat Java in small programs if you know what you're doing), 2. the compiler needs to collect profiles as the program runs, which results in a "warmup" period.
(If, like me, you like Zig, you might have seen Kelley talk about the "vtable barrier" in low-level languages; this doesn't exist in Java. You may also be interested in this talk, "How the JVM Optimizes Generic Code - A Deep Dive", by John Rose: https://youtu.be/J4O5h3xpIY8)
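Here's a hedged sketch of what seeing through dynamic dispatch looks like (class names are invented, and whether the JIT actually inlines depends on the profile it observes at runtime): when a call site only ever sees one receiver type, HotSpot can speculatively devirtualize and inline it, guarded by a cheap type check that falls back if a new class ever shows up.

```java
// Sketch: dynamic dispatch that the JIT can see through. If the loop below
// only ever observes Circle at runtime, HotSpot typically inlines area()
// straight into total(), guarded by a type check; a C++ vtable call at the
// same site could not be inlined without whole-program analysis.
public class Devirt {
    interface Shape { double area(); }

    static final class Circle implements Shape {
        final double r;
        Circle(double r) { this.r = r; }
        public double area() { return Math.PI * r * r; }
    }

    static double total(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) sum += s.area(); // monomorphic call site
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1.0), new Circle(2.0) };
        System.out.println(total(shapes)); // pi + 4*pi
    }
}
```

The language-level construct is fully dynamic; the cost is paid only if the site actually turns out to be polymorphic, which is the "good average, weaker worst case" tradeoff described above.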
As for memory, not only do moving collectors not degrade (or fragment) over time, they can use the RAM chip as a hardware accelerator. Unfortunately, when a program uses the GPU for acceleration it's considered clever, but when it uses the RAM chip for acceleration it's considered bloated, even though every CPU core these days comes with at least 1GB of RAM that you might as well use if you're using up the core, as that's effectively free.
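As a small illustration (a sketch, not a benchmark; exact behaviour depends on the JVM and its flags): in a moving collector, allocation in the young generation is a pointer bump, and collection work is proportional to the objects that *survive*, so a loop allocating millions of short-lived objects costs almost nothing, and escape analysis often removes the allocations entirely.

```java
// Sketch: millions of short-lived allocations. In a moving collector each
// allocation is a pointer bump and dead objects cost nothing to collect;
// the equivalent malloc/free churn is exactly the irregular pattern that
// degrades a C heap over time.
public class ShortLived {
    record Point(double x, double y) {}

    static double sum(int n) {
        double acc = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, 2.0 * i); // dies immediately; may be scalar-replaced
            acc += p.x() + p.y();
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(sum(1_000_000));
    }
}
```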
The people who consider that bloated are mostly those who haven't struggled with low-level programming long enough or on software that's large enough (they're the people who say, "I wrote this lean and fast gizmo by myself in 5 months"; 99% of the value delivered by software is in software written by large teams and maintained over many years). When I was working on sensor-fusion and air-traffic control software in the nineties, it wasn't "lean"; we just had no choice. We constantly had to sacrifice performance for correctness. Of course, once machines got better, we switched to Java for better performance. God could have written a faster program of that size in C++, but not a large team made up of people with different levels of experience. People who think C++ (or Rust) is particularly efficient are people who haven't written anything big and long-maintained with it.
In conclusion:
1. Sometimes layers of abstractions add performance overheads, and sometimes they remove them. It is not generally true that more abstraction/generality has a performance cost, especially when comparing different languages, although it is almost always true within one language (e.g. dynamic dispatch is never faster than static dispatch, and is often slower, in C++, but dynamic dispatch in Java can be faster than even static dispatch in C++, and the tradeoffs are elsewhere). If you didn't believe that, you'd be writing all your code in Assembly (which is what I did to get the fastest programs in the early nineties, but it's just not generally faster today thanks to good optimisation algorithms in compilers).
2. Low-level languages give you control, not speed. This control typically translates to better performance in small programs and to worse performance in large ones. This performance problem is intrinsic to low-level programming.
> Removing boxing can improve layout, footprint, and CPU utilization simultaneously. That would lie outside the framework "You can't improve one without harming the other."
First, the footprint won't reduce by much. E.g., in Java, boxing could cost you 10% of your footprint, but the RAM-assisted acceleration could be 80% of the footprint.
Second, yes good layouts help CPU utilisation, but today you can't get that without giving up on other things that harm performance. Dynamic dispatch and memory management in C++ and Rust are just too slow, and while Zig can be blazing fast, it's not easy to write large software in it without compromising performance any more than in any other low-level language. I hope that with Valhalla, Java will be the first language to let you enjoy everything at once, but it's not really an option today.
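A back-of-envelope sketch of that footprint point, using typical 64-bit HotSpot sizes with compressed oops (these constants are JVM- and flag-dependent, not guaranteed by any spec): an `int[]` stores 4 bytes per element, while an `Integer[]` stores a 4-byte reference per element plus a roughly 16-byte Integer object behind it.

```java
// Rough footprint arithmetic for boxed vs. primitive arrays. The constants
// (4-byte compressed reference, ~16-byte Integer object: header + int field,
// aligned) are typical HotSpot values, not a specification.
public class BoxingFootprint {
    static final long INT_BYTES = 4;         // primitive int payload
    static final long REF_BYTES = 4;         // compressed oop
    static final long BOX_OBJECT_BYTES = 16; // Integer object incl. header, aligned

    static long primitiveArrayBytes(int n) { return INT_BYTES * n; }

    static long boxedArrayBytes(int n) { return (REF_BYTES + BOX_OBJECT_BYTES) * n; }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println(primitiveArrayBytes(n)); // prints 4000000  (~4 MB)
        System.out.println(boxedArrayBytes(n));     // prints 20000000 (~20 MB, ~5x)
    }
}
```

So boxing can be a large multiplier on one data structure while still being a modest slice of a whole program's footprint, which is the 10%-vs-80% point above.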
> I'm saying Electron uses a lot of RAM and it has nothing to do with offloading work from the CPU, and everything to do with taking the most brute force approach to cross app deployment that we possibly can.
That developers choose it as a "brute force approach to cross-app deployment" doesn't necessarily mean that it doesn't also offload work from the CPU, but yes, Electron apps are probably very inefficient from some perspectives. But I think this is also overstated by people who are overly sensitive. When we say something is inefficient, it means we spend more on it than we have to, but what we really mean is that we could spend the resource we save on something else instead. On my M1 laptop, I comfortably run three Electron apps and two browsers simultaneously without much harming the speed at which I can, say, compile HotSpot, probably because SSDs are fast enough for virtual memory in interactive GUIs. I can't think of anything else I could use my laptop's resources for if the apps were leaner on RAM. Reducing the consumption of a resource that can't be meaningfully used for other work isn't real efficiency, and if it comes at the expense of anything useful, it's downright inefficient.