Since Cursor often relies on Claude models, some of those services will flow back to their own datacenter compute. Especially if there are, let's call them, "customer demand load-balancing optimization agreements" that make those Cursor services prioritize Claude models using the app keys that get load-balanced onto the SpaceX datacenter.
Did SpaceX just spend $10B to rent out its own datacenter, juicing their recurring revenue metrics with their own AI services investment?
With Anthropic's help. And when it's time for Anthropic to hype their IPO maybe SpaceX will return the favour and offer some deal that looks great to retail investors.
I don't think it's the conspiracy theory that you're making it out to be.
It is publicly known that the vast majority of deals in the AI space are circular in nature without the need for explicitly encoding any of it in a legal contract or even tacit agreements.
e.g. Nvidia has invested significantly in many AI companies including both Anthropic and OpenAI which rely heavily on Nvidia's hardware and will undoubtedly use some of said investment towards that end.
Nvidia and Oracle are already public companies, they're just aiming for their next quarterly statements.
SpaceX is getting dressed for their debutante ball and is putting on the makeup to make a grand entrance on the auction floor.
Is there a difference? I legitimately have no idea. You are right that we can add another entry to the list of interconnected circular dealmakings. All this ain't gonna end well next time the music stops playing.
Your argument is that since it is common in a bubble to make circular deals, there is no conspiracy. But you seem to suggest that people committing tens of billions of dollars aren’t looking any further down the pipeline than the name on the receiving bank account? Have you ever been anywhere near a large deal?
That's a lot to imply from my simple comment. My viewpoint is actually the exact opposite of what you claim: it all feels like a house of cards that is set to collapse at any moment. I can also tell you're quite passionate about this and I wonder if that emotion is clouding your interpretation of what was meant to be an innocuous comment.
My point was that there is a lot of this happening, it is not a unique statement nor is it surprising to see at this point.
I made no attempt to dismiss or justify any of it.
Sure, if "pretty smart" means overinvesting in capital spending on a dirty datacenter powered by unpermitted gas generators that you don't even need anymore because of a lack of demand for your product, so you lease it to a competitor (presumably at a huge loss). I am not sure that "major source of revenue" as a datacenter provider is the kind of growth opportunity that IPO investors are looking for.
Anthropic doesn't have that much pressure to pay, while Musk has an IPO coming up and he wants to clean up his numbers.
It's also not a good sign because he should be able to leverage Grok, his billion dollar investment, instead of renting it out to Anthropic. But hey, what does it matter to investors? If the IPO explodes, it is clear that people either can't read, don't care or don't understand.
Says who? Oracle spends a lot of money to get ready for AI customers like OpenAI. They aren't there yet. They can't lose money serving what they don't have.
It's not even that. It's better to be involved in the game with a leader, or to help out a competitor who is competing against someone you don't like and don't want to win, than to sit it out.
Are you worried about Google too? They're selling compute. Same with Microsoft, and Amazon. As far as I know Anthropic is really the only one that's compute-bound.
> As far as I know Anthropic is really the only one that's compute-bound.
I use Gemini models daily. JetBrains tells me when they are overloaded and switches to an alternative (usually to OpenAI, which turns everything to shit). I'd say it happens about fortnightly.
It's a good litmus and forecaster for AI demand and I wish we had more visibility.
Amazon is a bookseller and Google is just a web indexer. GCP didn't even open its preview until 2008. Not sure why you think a business model is in any way a static thing.
Amazon is a compute specialist, their competitive advantage is in the compute business. And conversely they're not really trying to play in the AI business, so it's not at all suspicious that they don't want to use all their compute themselves.
If it’s very large, especially if the tool needs to refer to documentation for a lot of custom frameworks and APIs, you often end up needing very large context windows that burn through tokens faster.
If it’s smaller or sticks with common frameworks that the model was trained on, it’s able to do a lot more with smaller context windows and token usage is way lower.
The codebase and the topic you're working on are huge variables.
I don't use LLMs to write code (other than simple refactors and throwaway stuff) but I do use them heavily to crawl through big codebases and identify which files and functions I need to understand.
Some of the codebases I explore will burn through tokens at a rapid rate because there is so much complex code to get through. If I use the $20 Claude plan and Opus, I can sometimes go through my entire 5-hour allocation in a single prompt exploring the codebase, and it's justified.
Other times I'm working on simple topics, even in a large codebase, and it will sip tokens because it only needs to walk a couple files to get to what it needs to answer my questions.
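To make that difference concrete, here is a rough sketch of estimating how many tokens an exploration pulls into context. The ~4 characters per token ratio is only a rule-of-thumb assumption (real tokenizers vary by model and language), and `estimate_tokens` and `context_cost` are hypothetical helpers, not part of any tool. The file sizes are made up for illustration.

```python
from typing import Dict

CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count (rounded up)."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def context_cost(files: Dict[str, str]) -> int:
    """Sum the estimated tokens for every file the model has to read."""
    return sum(estimate_tokens(body) for body in files.values())


# Hypothetical example: a question answered from two small files...
simple = {"a.py": "x" * 2_000, "b.py": "x" * 3_000}
# ...versus one that drags in fifty large files.
sprawling = {f"mod_{i}.py": "x" * 40_000 for i in range(50)}

print(context_cost(simple))     # 1250
print(context_cost(sprawling))  # 500000
```

Same tool, same plan, but the second question costs hundreds of times more under these made-up sizes, which is roughly the spread I see in practice.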
I'm currently in repos where the context window required is so large that the output is almost always "wrong" for the problem at hand. Quite a few people at my company burn through tokens this way, and it certainly isn't providing value to the company.
As always, improving accessibility for humans makes automation more effective. If the humans need to remember a PhD's worth of source code/documentation to contribute effectively, your codebase stinks.
People at my company have started writing docs specifically for Claude. They're quite useful for me too, but it's kinda disappointing they never wrote these docs for their colleagues.
As someone who has written many docs, it's because 99% won't read it (rightfully so if it's verbose). You can turn that doc into a skill in a repo and Claude will read it every time it's needed.
I recently saw this with the Logseq API: the published API docs were an auto-generated stub. So I tried to grep the source code for the function and found detailed documentation written for Claude. So I guess one benefit of all of this is that it's making people actually document things and maybe plan a little bit before implementing.
The LLM hype train has me reflecting on what a spoiled existence working in a ‘proper’ language provides though…
React devs, JS devs, front-end devs working on large sites and frameworks might be triggering tens of files to be brought into context. What an OCaml dev can bring in through a 5 line union type can look very different in less token-efficient and terse languages.
Begs the question if we should move on to minimal microservices so that the whole project lives in the LLM's context. I hardly have to do anything when I'm working on a small project with an LLM.
Why not take it a step further? Make each function in the codebase its own project. Then the codebase can fit into the context window easily. All you have to do is debug issues between functions calling each other.
I don't think it's a joke about left-pad, but the idea that the complexity increases tremendously when you take a cloud of "small" things all communicating with each other. You've just pushed the complexity elsewhere. Claude can easily crunch the small microservice, but you're pushing the complexity to communications issues, race conditions, etc.
Oddly enough I constantly run into the same issue on monolithic codebases too.
Things could just be one file but they end up being 12. I had to look through 12 levels of indirection for a single boolean recently. Twice, on two separate projects in the same week.
At least in a single codebase, that issue is at least theoretically solvable. At least the indirection wasn't split across 12 repos!
Like I said, if your work is already contained neatly inside one microservice then it doesn't matter.
The same would be true in a monolith: The context to understand what's happening would be contained to a few files.
When the work starts crossing through domains and potentially requiring insight into how other pieces work, fail, scale, etc. then the microservice model blows up complexity faster than anything, even if you have the API documented.
Ironically this is accidentally begging the question: it assumes that breaking them up to fit LLM context windows would be good because it would fit them in LLM context windows.
Maybe you're right but I'm aghast at how much of engineering over the last 15 years has been breaking up working monoliths to fit better within the budget of an external provider (first it was AWS). Those prices can change.
There are good reasons to use microservices but so often they're used for the wrong reasons.
I've done the opposite, moving multiple tightly coupled repos into a single monorepo. Saves the step of the LLM realizing there's a bigger context, finding the repo, then also scanning/searching it. Especially for fixes that are simply one line each in two repos.
I'm a fan of the monorepo in general, even before LLMs. If using git it leverages git's best feature IMO, the commit as a snapshot of the entire repo. I've worked on so many projects where tightly coupled things are split across repos because it's thought of as a best practice, and it just makes it more difficult to figure out what code you are running.
Generally speaking no. Treat your IP (the code that runs your business, makes your business competitive or special) as precious and don't make it subservient to infra. It should be in the format (code, architecture, structure) that best serves it.
Yes, in a reasonable microservice land where the places you need to connect to are all documented in very concise places, you can have extremely productive $10 days. In the giant monorepo with everything custom, you can't just rely on built-in knowledge of 80% of your libraries, so it's a very different world.
A place like Google has to be so much better off just training library concepts in, given how many of the things the LLM will "instinctively" reach for are unlikely to be available. Not unlike the acclimation period that happens when someone comes in or out of a company like that, and suddenly every library and infra tool you were used to is just not available. We need a lot more searching when that happens to us, and the LLM suffers from the same context issue. The human just has all of that trained in after 6 months, but the LLM doesn't.
> A place like Google has to be so much better off just training library concepts in
They did that, there was a special version of Gemini fine-tuned on internal code. But then the main model moves so fast that it is hard to keep such fine-tunes up to date and on the latest.
Unquestionably good. They want a product that provides value anywhere it's tried so as to establish the reputation as a magic human replacement. Gaming consumption based pricing at this point would be quitting before the race is over. They can always tweak the pricing knobs later once the industry is fully hooked.
Why review thousands of lines of LLM generated code from some random person you don’t know when you could use an LLM yourself to do the same thing, except with probably a better design and more thoughtful approach?
Maintainers should get to spend their time developing stuff, not just reviewing low effort PRs. The flood of LLM code is changing the balance for the worse for maintainers, and I can totally see why they’d just want to ban it.
But at least you know how the sausage was made by the end. You have no idea how high or low quality any PR from a random person online is, and taking any amount of time to review a PR could be a total waste.
if someone made the same gigantic mess of a PR without LLMs, it would still be rejected, because it is a gigantic mess of a PR.
the low effort part is the problem. what if i made a great, focused, readable PR but had claude write it out? what if i carefully checked and deliberated each line, just as if i had written it myself?
granted, in the real world, 99.9% of slop PRs are written by LLMs. so i thought "okay, reasonable, ban the thing that is most likely to cause problems."
but then how does the "no LLM translators!" rule fit into that view?
It’s the lack of friction that LLMs bring. It’s easy to put in a couple of lines and generate thousands of lines of code. Whereas the person would never have done that without LLMs.
I think LLM dev needs to take a better spec driven approach. The vibing is getting to be annoying.
I think the only thing that will save us is smarter models. Slop coders are not going to stop making slop.
They’ll still use even smarter LLMs badly no doubt, but I’m thinking that maintainers of open source projects will be able to more effectively use LLMs to review potential PRs to weed out the truly bad ones quickly.
I guess they could set up a competent openclaw PR review agent. The problem is again cost. Who is paying for the token usage by open source projects? How many tokens before they exhaust their quota with junk PRs?
Well previously lazy contributors simply would never have made a PR because it was too much work. Now they can have an LLM make a PR with virtually no effort at all.
It’s obviously an imperfect rule, and maybe it’ll change over time. But I am just saying that I understand why open source maintainers are doing this.
There is just no possibility for them to review all the low effort AI slop being thrown their way. Yes, some of it is going to actually be very high quality, but you don’t know that until you review it, which is the whole issue.
Agree, but "no LLMs" locks out good PRs and contributors too.
I hypothesize it may have roughly the same effect as denying all contributions where the author used IntelliSense 10 years ago.
A substantial portion of people who write good code will be using some sort of LLM assistance, even if it is just something like Cursor Tab (autocomplete).
Yes, you'll also hit all of the spammy PR "contributors", but you'd also do so by prohibiting all contributions by people who have a belly button.
It might be just our region, but for a long time we couldn't access current frontier models at all. Only old GPT-4-level models. Meanwhile, Anthropic is rolling out access to every model on Bedrock within 24 hours of announcement.
Not sure how much Azure OAI has changed, but when I last used it 2-3 years ago, it was just a farce to get you using provisioned throughput. The throughput quotas were small, the process to request more was bureaucratic, and the Azure SAs were
It was also very clear the OAI and MS teams held each other in contempt (not relevant, but was interesting and grew in the immediate aftermath of the Altman drama).
So why were we using it? OpenAI didn’t really have an enterprise go-to-market, Bedrock still relied on Claude 2, and we weren’t willing to YOLO on clickthroughs.
Once Claude 3 came out, we jumped ship. That sucked too, although I hear it’s gotten better.
I have been out of OpenAI Azure deployments for a while, 1 year+, but we often had spikes in latency, escalated to the Head of Azure Europe, and still got no official clues about them. Meanwhile they were trying to get us into some kind of collaboration announcements. And that was the only reason we had a few meetings with that guy.
So yeah, Azure sucked ass and had plenty of outages or latency, like 3min for first byte while usually it was max 30sec to 1min, if not even faster (memory is a bit fuzzy).
I don’t see anywhere that it’s something they specifically decided not to support. Probably they just haven’t gotten around to it yet? Multithreading is notoriously difficult to get right.
It says it isn't supported right in the readme. It just isn't clear on the "why" yet. My hope is that they simply haven't gotten to it yet. I maintain 14+ highly threaded Ruby services atm, for context.
The EPA push for fuel efficiency made it easier to hit targets by selling huge trucks instead of small cars.
There is a value in safety regulation but the incentives as legislated have led to negative results. It needs to be fixed or repealed. Not sure there's a clean solution here.
Would it be possible to increase the cache duration if misses are a frequent source of problems?
Maybe use a heartbeat to detect live sessions and cache those for longer than sessions the user has already closed. And only do it for long sessions where a cache miss would be very expensive.
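A minimal sketch of what I mean, assuming the client can ping the server while the session is open. Every name and number here (`CachePolicy`, the TTL values, the token threshold) is made up for illustration; I have no idea how the provider's cache is actually implemented.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict

BASE_TTL = 300.0               # assumption: default cache lifetime, seconds
EXTENDED_TTL = 3600.0          # assumption: lifetime for live, long sessions
HEARTBEAT_WINDOW = 60.0        # session is "live" if pinged this recently
LONG_SESSION_TOKENS = 100_000  # assumption: threshold for an expensive miss


@dataclass
class Session:
    cached_tokens: int
    last_heartbeat: float


@dataclass
class CachePolicy:
    clock: Callable[[], float] = time.monotonic
    sessions: Dict[str, Session] = field(default_factory=dict)

    def heartbeat(self, session_id: str, cached_tokens: int) -> None:
        """Record that the client is still open and how much it has cached."""
        self.sessions[session_id] = Session(cached_tokens, self.clock())

    def close(self, session_id: str) -> None:
        """The user closed the session, so it falls back to the base TTL."""
        self.sessions.pop(session_id, None)

    def ttl(self, session_id: str) -> float:
        """Extend the TTL only for live sessions where a miss is expensive."""
        s = self.sessions.get(session_id)
        if s is None:
            return BASE_TTL
        live = self.clock() - s.last_heartbeat <= HEARTBEAT_WINDOW
        expensive = s.cached_tokens >= LONG_SESSION_TOKENS
        return EXTENDED_TTL if live and expensive else BASE_TTL
```

Closed or idle sessions fall back to the short TTL, so the extra cache capacity is only spent where a miss would actually hurt.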
Not to mention that there are differences in ecosystem, familiarity, and ergonomics that may make a team want to stick with Python.
“Just use Go” is not really actionable advice in most cases.