Hacker News | kennywinker's comments

Calling the GPUs the shovels is bonkers because a) shovels are cheap and GPUs are not, and b) when you build a bridge, the bridge doesn’t need shovels to be passable. Without GPUs, the datacenter is useless, the model is useless, etc.

If anything, the GPUs are the steel that the bridge is made of. Each beam can be replaced, but if too many fail the bridge is impassable. A bridge with a 6 year lifespan for each beam is insane.


You’re taking the metaphor way too literally. The people who made the most profit weren’t literally selling shovels, they were the ones providing logistics and support services to the gold miners, like hauling tons of equipment over tens of miles of mountain or providing the sales channel for the gold. They siphoned off most of the profit from the ventures that depended on them (like LLMs depend on GPUs) because the miners had no other choice, to the point where even the most productive mines often weren’t profitable at all.

A less literal example is the conquistadors: their shovels were ships, horses, gunpowder, and steel. You can look at Spanish records from the Council of the Indies archive and any time treasures were discovered, the price of each skyrocketed to the point where only the wealthiest hidalgos and their patrons could afford to go on such adventures. E.g. the cost of a ship capable of a cross-Atlantic voyage went from 100k pieces of eight to over a million in the span of only a few years (predating the treasure fleet inflation!)

Gold rushes create demand shocks, and anyone who is a supplier to that demand makes bank, regardless of whether it’s GPUs or “shovels”.


> You can look at Spanish records from the Council of the Indies archive and any time treasures were discovered, the price of each skyrocketed to the point where only the wealthiest hidalgos and their patrons could afford to go on such adventures.

Today this is real estate. And it's something people keep forgetting when arguing that ${whatever breakthrough or just more competition} will make ${some good or service} cheaper for consumers: prices of other things elsewhere will rise to compensate and consume any average surplus. Money left on the table doesn't stay there for long.


GPUs don't really have six year lifespans, though. The hardware itself lasts far longer than that, even hardware that's been used for cryptomining in terrible makeshift setups is absolutely fine for reuse.

Each of these GPUs pulls up to a kilowatt of power. The average commercial power cost is 13.4 ¢/kWh. That means running a single H100 full tilt 24/7 works out to an operating power cost of roughly $1,100 per card per year.

In three years the current generation of GPUs will be 50% or more faster. In six years you're talking more than 100% faster. For the same energy costs.

If you're running a GPU data center on six year old GPUs, your cost to operate per sellable unit of work is double the cost of a competitor.
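
For what it's worth, a quick sketch of that arithmetic (the draw, rate, and speedup are the assumptions above, not measurements):

    # Back-of-envelope for the numbers above (all inputs are assumptions).
    kw_per_card  = 1.0            # assumed full-tilt draw per card
    usd_per_kwh  = 0.134          # average commercial rate cited above
    hours_per_yr = 24 * 365

    power_cost_yr = kw_per_card * hours_per_yr * usd_per_kwh
    print(f"power per card per year: ${power_cost_yr:,.0f}")
    # ~$1,170 at a flat 1 kW; a bit less at the ~0.94 kW that matches the $1,100 figure

    # If a 6-year-newer card does ~2x the work per watt, the old card pays
    # roughly double in power per sellable unit of work.
    print(power_cost_yr / (power_cost_yr / 2))   # 2.0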


One thing I am not entirely sure about is whether there will be huge efficiency gains. Just looking at TDP, i.e. the power consumption of, say, a 3090 vs a 5090, the increase is substantial; then compare it to performance and the performance lift stops looking that great...

3x increase in compute for a 1.5x increase in TDP is pretty good considering the underlying process has barely changed. In any case, consumer GPUs aren't a good metric as they operate with different economic constraints.

H100 to GB200 saw a 50x increase in efficiency, for example.


https://www.nvidia.com/en-us/data-center/gb200-nvl72/

Nvidia only advertises 25x efficiency. And that is their word...


Sure. But if that fully depreciated, $1,100/year-to-run GPU produces $20k of economic benefit, would you decommission it as long as there is demand?

I want to see math on how a single GPU will pull down that much revenue, because that seems like a dubious outcome.

Fair, I was hand waving to make a point. “If it generates more than $1100 + (resale price * WACC) + opportunity cost from physical space/etc” would have been more accurate.

But the point is — you don’t decommission profit generators just because a competitor has a lower cost structure. You run things until it is more profitable for you to decommission them.
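
A toy version of that decision rule, roughly matching the formula above (all of the inputs are hypothetical placeholders):

    # Keep a depreciated GPU running while it still beats decommissioning it.
    def keep_running(revenue_yr, power_yr, resale_value, wacc, other_costs_yr=0.0):
        opportunity_cost = resale_value * wacc   # what the resale cash could earn instead
        return revenue_yr > power_yr + opportunity_cost + other_costs_yr

    # Hypothetical numbers: $20k/yr of benefit, $1,100/yr power, $5k resale, 10% WACC.
    print(keep_running(20_000, 1_100, 5_000, 0.10))   # True -> keep it running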


That all depends on if you're running your own hardware (unlikely) or renting.

If my data center sells a pflop at $5 because of our electricity use and the data center a state over with newer GPUs sells it at $2.50/pflop, it doesn't matter how much economic benefit it generates, my customers are all going to the data center a state over.

In the context of datacenters running AI workloads, it's cheaper to replace GPUs after a few years with faster, more energy-efficient ones, because power cost is a major factor

> A bridge with a 6 year lifespan for each beam is insane.

Not necessarily. Depends entirely on the value of the transport that the bridge enables.


For resin printing, doing it yourself almost never makes sense. It’s expensive, fiddly, messy, hazardous to your skin and lungs, and consumes a lot of space to do right.

Filament printing, on the other hand, makes sense to do yourself quite often. A $200 printer will do an excellent job of most things you can throw at it, it doesn’t take up much space, is quite safe unless you’re using weird filaments, and even a kid can learn the basics in a couple days.


For some the appeal of agent over human is the lack of accountability. “Agent, find me ten targets in Iran to blow up” - “Okay, great idea! This military strike isn’t just innovative - it’s game changing! A reddit comment from ten years ago says that the military often uses schools to hide weapons, so here is a list of the ten most crowded schools in Iran”

It must be wild to actually go through life believing the things written in this post and also thinking you have a rational worldview.

> This creates a recurring pattern on r/LocalLLaMA: new model launches, people try it through Ollama, it’s broken or slow or has botched chat templates, and the model gets blamed instead of the runtime.

Seems like maybe, at least some of the time, you’re being underwhelmed by Ollama, not the model.

The better performance point alone seems worth switching away for


I follow the llama.cpp runtime improvements and it’s also true for this project. They may rush a bit less but you also have to wait for a few days after a model release to get a working runtime with most features.

Model authors are welcome to add support to llama.cpp before release like IBM did for granite 4 https://github.com/ggml-org/llama.cpp/pull/13550

I believe Bermuda is a tax shelter country, which means people and companies register there to hide identity and income from the nations they live and do business in. Because of that, the vast majority of businesses registered in Bermuda are not legitimate institutions - they are shell companies defrauding their home nations.

And the home nations' governments defraud their people with unnecessary wars, wasteful spending, unpayable debt, and excessive inflation. There comes a time when paying less tax is the right thing to do.

I can think of few groups as likely to support wars as the ultra rich, but if you are very wealthy and don’t like your tax dollars going to military spending, just invest in Lockheed or Raytheon and get it all back as dividends. War spending doesn’t justify tax fraud, unless you’re also out on the protest line when a new war breaks out.

As the top tax rates fell - from 90% in 1950 to under 40% now - the use of tax shelters increased. So unless your “comes a time” is referencing pre-1915 USA, this isn’t a valid justification.

If inflation is the issue, keep your money in a different currency.

I just don’t see actions from the very rich (the ones using tax shelters) that back up your justifications.

I think it’s simply the collapse of any kind of cohesion between the wealthy and the nation in which they live. Or put another way: I’m rich, I shouldn’t have to pay for stuff I don’t use!


Why are you even defending this practice? It's something very wealthy people do, they're not your everyday citizens conscious about how their taxes go.

They evade taxes for financial reasons, not moral reasons.


Amazing! Let me see, doing the math r/n… carry the one, yup that makes the total number of non-speculative uses for crypto and stablecoin: 1

;P


It always has been payments. x402 and Stripe Tempo make the use case count more than 1.

Their estimate is based on significantly lower consumption when under load. E.g. 25W for an M4 Pro Mac mini. I have no idea if that’s realistic - but the M4s are supposedly pretty efficient (https://www.jeffgeerling.com/blog/2024/m4-mac-minis-efficien...)

Their example big earner models are FLUX.2 Klein 4B and FLUX.2 Klein 9B, which I imagine could generate a lot more tokens/s than a 26B model on your machine.

For Gemma 4 26B their math is:

single_tok/s = (307 GB/s / 4 GB) * 0.60 = 46.0 tok/s

batched_tok/s = 46.0 * 10 * 0.9 = 414.4 tok/s

tok/hr = 414.4 * 3600 = 1,492,020

revenue/hr = (1,492,020 / 1M) * $0.200000 = $0.2984

I have no idea if that is a good estimate of how much an M5 Pro can generate - but that’s what it says on their site.

They do a bit of a sneaky thing with the power calculation: they subtract 12 W of idle power, because they are assuming your machine is idling 24/7 anyway, so the only cost is the extra 18 W they estimate you’ll use doing inference. Idk about you, but I do turn my machine off when I am not using it.
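
Putting their quoted math and that power accounting together in one place (a sketch; the electricity rate is my assumption, the rest are their figures from above):

    # Their calculator's math (quoted above), plus the electricity they deduct.
    bandwidth_gbs, model_gb, eff = 307, 4, 0.60      # their single-stream model
    batch, batch_eff             = 10, 0.9
    price_per_mtok               = 0.20              # $ per 1M tokens

    tok_hr = (bandwidth_gbs / model_gb) * eff * batch * batch_eff * 3600
    rev_hr = tok_hr / 1e6 * price_per_mtok           # ~$0.2984/hr, as quoted

    extra_w, usd_per_kwh = 18, 0.134                 # their 18 W above idle; the rate is my guess
    power_hr = extra_w / 1000 * usd_per_kwh          # ~$0.0024/hr
    print(rev_hr, power_hr, rev_hr - power_hr)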


Interesting token numbers they're using, because I've benchmarked it at 69 tok/s single stream and 130 multi stream.

I have a hard time believing their numbers. If you can pay off a mac mini in 2-4 months, and make $1-2k profit every month after that, why wouldn’t their business model just be buying mac minis?

The numbers are optimistically legit -- they're calculated purely on the assumption that we have demand for all machines at all times. We don't have that right now, but we're fairly optimistic that people will.

That's why we don't recommend purchasing a new machine. Existing machine is no cost for you to run this.

Electricity is one cost, but it will get paid off from every request it receives. Electricity is only deducted when you run an inference. If you have any questions, DM me @gajesh on Twitter.


> That's why we don't recommend purchasing a new machine. Existing machine is no cost for you to run this.

You misunderstood. If the ROI is there, there is enough capital in existence for you to accelerate your profit. So why even deal with the complexity of renting people's hardware when you can do it yourself?


No, what he's saying is that he expects this to be the ROI in the future because his product is so good.

> Existing machine is no cost for you to run this.

That is not at all how modern chips work. Idle chips are mostly powered down; non-idle ones are working, and that causes real, measurable wear and tear on the silicon. CPU, RAM, and NAND all wear measurably with use on current manufacturing processes.

https://en.wikipedia.org/wiki/Electromigration


Like pitching "drive rideshares for only the cost of your time & gas"

The question is, do they wear faster than they become obsolete, as in much more expensive to run than buying a new one with higher compute/watt. (and you can also factor in the ability to run latest models at usable speed)

It’s complicated. When you design with modern PDKs, you consider the expected duty cycle of the device, expected temps, and the wear and tear on the silicon all together. That affects the layout of the chip as well as certain choices about widths of various features. Generally, one designs consumer SoCs to last 10 years with the expected duty cycle (low). With more wear you could run out of your “years” much faster, maybe even before the warranty.

You're not taking into account the thermal strain on the machine, though. A machine that's 100% utilized (even worse if it's in bursts) will last less than an idle machine.

Not appreciably, and not before a 5-yr AppleCare+ warranty expires.

Out of our >3000 currently active Apple Silicon Macs, failures due to non-physical damage are in the single digits per year. Of those, none have been from production systems with 24/7 uptime and continuous high load, which reflects your parenthetical.

Perhaps we haven't met the other end of the bathtub curve yet, but we also won't be retaining any of these very far beyond their warranty period, much less the end of their support life.


AppleCare+ annual is perpetual as long as you keep paying it (and Apple offers to switch to that when your 3-year lump sum expires if you choose that instead). I’m guessing it ends whenever they officially discontinue hardware support, which has traditionally been about 7 years after the last unit is produced, but I haven’t reached that yet to know for sure.

I think the point of this is more "use the machine you have at home" than "do a TCO analysis and see if it's profitable", though. People like to keep their machines working for longer, generally.

> Not appreciably, and not before a 5-yr AppleCare+ warranty expires.

It’s three years for Macs, though I believe you can pay annually for longer. Five has never been a thing to my knowledge.


> A machine that's 100% utilized (even worse if it's in bursts) will last less than an idle machine.

How much though? Say I have three Mac Minis next to each other, one that is completely idle but on, one that bursts 100% CPU every 10 minutes and one that uses 100% CPU all the time, what's the difference in how long the machines survive? Months, years or decades?


I don't worry about bandwidth or constant CPU use, but the one thing that will kill my mac is burning out the SSD.

The calculator gives numbers for nearly everything, but I can't obviously see how much space it needs for model storage or how many writes of temp files I should expect if I'm running flat out.


I assumed you'd want to run one immutable model that can fit in memory without any temp files.

Put that stuff on an external disk, perhaps; it will eventually crater, but it's easier to replace than MacBook internal storage (how are they doing Mac minis these days?)

Won't high memory pressure result in heavy memory paging, which will wear down the internal SSD?

My question is why did you have to design this to use an MDM instead of a simple program running in the terminal or something?

If you start buying minis, then you need to house, power, and cool them. So you are building a mini data center. If you are building a small data center, economies of scale will drive you to want to build larger and larger. However, this gets expensive and neighbors tend to not like data centers (for good reason). To me this seems like asymmetric warfare against hyper-scalers.

Yup. This way, the people pay for the air conditioning themselves and they probably don't even notice the extra cost.

& if they live in a cool place, they're getting a small space heater as a bonus.

Because they don’t have that much initial money in their pocket, while the idle computer is already there, and the biggest friction point is convincing people to install some software. Producing both rhetoric and software is several orders of magnitude cheaper than directly owning and maintaining a large fleet of hardware, with a guaranteed stable electrical supply and a safe place to store it.

Assuming that getting a large chunk of initial investment is just a formality is out of touch with the reality of 99% of people out there, when it’s actually the biggest friction point in any socio-economic endeavour.


No provider maintains 100% utilization of GPUs at full rate. Demand is bursty - even if this project is successful, you might expect, e.g., things to be busy during stock-market hours when Claude is throwing API errors, and then severely underutilized during the same hours that Anthropic was offering two-for-one off-peak use.

And then there's a hit for overprovisioning in general. If the network is not overprovisioned somewhat, customers won't be able to get requests handled when they want, and they'll flee. But the more overprovisioned it is, the worse it is for compute seller earnings.

I suspect an optimistic view of earnings from a platform like this would be something like 1/8 utilization on a model like Gemma 4. Their calculator estimates my m4 pro mini could earn about $24/month at 3 hours/day on that model. That seems plausible.
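
Roughly, using their ~$0.30/hr figure for that model (the 3 hours/day of paid demand is my guess at an optimistic utilization, not their number):

    rev_per_hr   = 0.2984      # their calculator's revenue/hr for Gemma 4 26B
    paid_hrs_day = 3           # assumed ~1/8 utilization
    print(rev_per_hr * paid_hrs_day * 30)   # ~$27/month, same ballpark as their ~$24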



What's your Y axis?

Solid q. I think part of it is that it’s really easy to attract some “mass” (capital) of users, as there are definitely quite a few idle Macs in the world.

Non-VC play (not required until you can raise on your own terms!) and clear differentiation.

If you want to go full-business-evaluation, I would be more worried about someone else implementing the same thing with more commission (imo 95% and first to market is good enough).


I think the point they’re making though is that the numbers seem too good to be true.

ie. Does anyone know the payback time for a B100 used just for inference? I assume it’s more than a couple of months? Or is it just training that costs so much?


Eigenlayer (which this is spun off from) is a massively VC-funded crypto company.

It is too good to be true, when you see it supposedly making more than a Claude Code subscription for fuck-all work per day.

Prolly gonna make $50 a year tops.


Or, like anything else, it will be too good to be true at the very beginning, but then once people hear about it and it gets popular, supply overtakes demand and the Mac minis go back to being idle most of the day.

When YouTubers start making videos about it you know it's too late.


The question is what the hassle of running it is, plus wear and tear. The max price will tend toward that. It is not like crypto, where there is capital investment in a rig that can do nothing else. People are using their existing laptop. So I reckon $20-50 a year max per laptop.

The numbers are obviously high, because if this takes off then the price for inference will also drop. But I still think it’s a solid economic model that benefits low income countries the most. In Ukraine, for example, I know people who live on $200/month. A couple Mac Minis could feed a family in many places.

As a business owner, I can think of multiple reasons why a decentralized network is better for me as a business than relying on a hyperscaler inference provider.

1. No dependency on a BigTech provider who can cut me off or change prices at any time. I’m willing to pay a premium for that.

2. I get a residential IP proxy network built-in. AI scrapers pay big money for that.

3. No censorship.

4. Lower latency if inference nodes are located close to me.


How many of those people who could live off $200USD/month can afford or already have a mac mini in the house?

They already have an iPhone. They could save up or borrow for a Mac Mini if they had to. Some of those people I know who live on $200/month have $30k in the bank.

then you are talking about low spend, not low income

Not really. There are lots of people who have low income and low spending, but not low savings. Retired pensioners with savings. Young families who inherited from deceased parents/grandparents. Highly paid professionals on sabbatical. I've met people from all of those categories in Ukraine who live on $200/month.

Isn't this the same premise as "let's buy a few GPUs to mine crypto and have passive income"? It didn't work very well and it probably won't work now either. If there is money to be made, bigger players will get in there, buy out all the Mac minis they can, drive the price up for regular people, and inevitably drive inference prices down, so you'll be lucky to get your initial investment back

No it's not the same premise at all. Crypto doesn't do anything useful for legitimate businesses. AI inference is very useful for legitimate businesses, and so are residential IP proxies for scraping. And by definition, residential IPs cannot be centralized. And as building GPUs becomes more expensive, the existing pool of second hand unused hardware becomes more valuable, not less. The problem with crypto mining is that it quickly becomes unprofitable for small scale deployments. I'm not sure if AI inference would be, especially for the decentralized benefits of lower latency.

It is the same premise, because the person you are responding to is not talking about the moral implications at all, only about the financial / hardware implications.

Running AI inference increases the power draw, and requires certain hardware.

Mining bitcoin increases the power draw, and requires certain hardware.

OP's point thus stands: Bad players will find places to get far cheaper power than the intended audience, and will buy dedicated hardware, at which point the money you can earn to do this will soon drop below the costs for power (for folks like you and me).

Maybe that won't happen, but why won't that happen?


The main problem with crypto is there is no universal need for it. The demand for crypto doesn’t keep increasing as compute gets cheaper. But the demand for AI inference is only growing, and making it cheaper would likely only increase demand. So it’s not a race to the bottom. Sure hyper focused players can earn more at higher margins. But average players can probably still earn decently. Take for example electricity. It can still be profitable for a home in Germany to install balcony solar and make a little money selling back to the grid even though it’s obviously not as efficient as an industrial power plant. Mom and pop AI inference don’t have to be super efficient as long as they serve a universal need - it will be like balcony solar in Europe.

The residential IP proxy point is, I think, invalidated by their privacy model. I think they aren’t offering up your IP, just your GPU.

On the latency point - your requests are still going through the coordinator of the system here. So on average strictly worse than a large provider.

You - Darkbloom - Operator - Darkbloom - you, vs

You - Provider - you

---

On the censorship point - this is an interesting risk surface for operators. If people are drawn to decentralized model provisioning for its lax censorship, I'm pretty sure they're using it to generate things that I don't want to be liable for.

If anything, I could imagine dumber and stricter brand-safety style censorship on operator machines.


I'm not talking about Darkbloom specifically, but rather this business model in general. I'm sure a future version of Darkbloom could be P2P for better latency. Or their central operator nodes could be geo-balanced. Liability for censorship doesn't matter if it's truly zero trust. Anyway censorship is not my main concern. Low-latency decentralized inference with no US BigTech dependency is a much bigger selling point in Europe.

It's quite funny thinking about a chimpanzee seeing a lot of bananas and thinking “this could feed my family”, and then the same with humans, only with Mac Minis.

> These are estimates only. We do not guarantee any specific utilization or earnings. Actual earnings depend on network demand, model popularity, your provider reputation score, and how many other providers are serving the same model.

Others are reporting low demand, eg.: https://news.ycombinator.com/item?id=47789171


Of course these numbers are ridiculous. A Mac Mini (let's assume Apple releases an M5 Pro) tops out at ~50 TFLOPs Int8 (let's assume it is the same as FP8, which it is not). With Draw Things, we recently developed hybrid NAX + ANE inference, which can get you ~70 TFLOPs.

An H200 gives you ~4 PFLOPs, which is ~60x the compute at only ~40x the price (assuming you can get a Mac Mini at $1000). (Not to mention, BTW, an RTX PRO 6000 is ~7x the price for ~40x more FLOPs.)

Your M4 Mac Mini only has ~20 TFLOPs.
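
Back-of-envelope perf-per-dollar behind those ratios (the prices here are rough assumptions, not quotes):

    # TFLOPs per dollar, using the rough figures above (prices are assumptions).
    mac_mini = {"tflops": 70,    "usd": 1_000}    # M5 Pro-class with NAX+ANE (assumed)
    h200     = {"tflops": 4_000, "usd": 40_000}   # ~4 PFLOPs FP8 at an assumed ~40x price

    for name, hw in (("mac mini", mac_mini), ("h200", h200)):
        print(f"{name}: {hw['tflops'] / hw['usd']:.3f} TFLOPs/$")
    # mac mini: 0.070, h200: 0.100 -> the H200 gives ~60x the compute for ~40x the price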


> Your computer only has ~20 TFLOPs.

What a time to be alive.


Power and racking are difficult and expensive?

How difficult? Is running 1000 minis worth $1,000,000/month of effort? I feel like it is.

And at that scale (1k) it ain't even that hard; a single room could be enough to haphazardly drop them on shelves with a big fan to draw out the heat

There are many people who do not have ready access to a million dollars to purchase said Mac minis, much less the operating capital to rack & operate them.

Very smart play to build a platform, get scale, and prove out the software. Then either add a small network fee (this could be on money movement on/off platform), add a higher tier of service for money, and/or just use the proof points to go get access to capital and become an operator in your own pool.


If those numbers are true, they could start with one Mac and double every few months. But I guess there are also many people who do not have ready access to whatever a Mac mini costs either...

You can run the simulation out, but if the idea works, you can get to scaled revenue much faster than by growing organically and keeping 100% of the margin.

This is essentially the same reason even the best money managers take outside money to start, even if they eventually kick out the investors.


Because the "ship software to people, rent their hardware" model has zero up front investment required, presumably. And they don't have to deal with power, cooling, real estate.

"You could see a single robotaxi being worth, or providing, about $30,000 of gross profit per year. ... A Tesla is an appreciating asset..."

- Elon Musk during Tesla's Autonomy Day in April 2019.


Capital and availability?

I guess if it only works at scale, capital is maybe the answer. Like, enough cash to buy 5 or 10 or even 100 minis seems doable - but if the idea only works well when you have 10,000 running, that makes some sense.

Being the middleman is often way more profitable

Because their numbers don’t work out. When you do the math on token cost versus inference speed, you get something that barely breaks even, even with cheap power.

Also they’ve already launched a crypto token, which is a terrible sign.


Django is BSD licensed. This is BSD licensed. I fail to see the issue?
