Not the first to notice this, I'm sure, but it feels like there's an insane amount of pressure pushing capital toward anything with a hint of AI legitimacy. It's as if asset owners across the planet have come to a consensus that this is the only industry that will matter going forward (fair enough, I guess), but that intense systemic pressure squeezes enormous amounts of money toward literally any AI-shaped outlet that opens up. It's starting to feel like "scared and desperate" money more than "smart money".
Is it not a case of many funders not wanting to risk missing out on the next big thing? And isn't losing a few billion now better than losing many billions down the line, along with control of the future?
Of course the motivation makes sense on the surface. What I'm getting at is that the supply of capital versus the supply of potential "control of the future" plays feels incredibly imbalanced. Money seems so desperate to move into AI that it's lost all prudence (the particular people and company mentioned in the OP notwithstanding; maybe they do deserve 1B).
"not wanting to risk missing out" is essentially just FOMO right? "Smart" money has feels more like FOMO money these days. We literally have shoe companies savying they're going to pivot to AI and having their market cap increase in multiples as reward.
I don't think Silicon Valley has been smart money for a decade plus. Quantum computing is becoming the exact same with academic and government funding, with a lot of cash being spent on long shots or no-hopers.
It clearly was, at least in part. Somehow, it feels just right here: Man trusts AI to do the right thing and it burns him. 5 minutes later, man trusts AI to explain what happened on X.
I like the way the LLM implies that an API call should have a “type DELETE to confirm”. That would make no sense, and no human would ever suggest or want that, I hope.
Never thought I'd see the day ragebait made it to HN. Yes, let's pretend doing a long jump on the moon is comparable to running a marathon at its prescheduled time and prescheduled location. Weather is always a factor in sports that take place outside. Might as well put asterisks on all accomplishments that took place on sunny days, by your logic, right?
Not sure I understand what you mean by "scientific." If you mean exactly reproducible, then almost nothing in athletics fits that definition. Every record in baseball, football, etc. would fail that definition.
I’m deeply interested and invested in the field but I could really use a support group for people burnt out from trying to keep up with everything. I feel like we’ve already long since passed the point where we need AI to help us keep up with advancements in AI.
This one’s been particularly hard to sit out because the executive and managerial class are absolutely mainlining this stuff and pushing it hard on the rest of the organization, and so whether or not I want to keep up, I need to, because my job is to actually make stuff work and this stuff is a borderline existential risk to the quality of the systems I’m responsible for and rely on.
This is only good advice if you don’t need to understand what’s happening at the edge of the frontier. If you do, then you’ll lose out on the compounding knowledge that comes from staying engaged with the major developments.
Not all developments are equal. Many are experimental branches for testing things out that usually get merged back into the core, so to speak. For example, I knew someone who was all-in on building their own harness, implementing the Ralph loop and various other things, spending a lot of time on it. And now, guess what? All of that is in Claude Code or another harness, and I didn't have to spend any time on it, because ultimately they're implementation details.
It's like ricing your Linux distro: sure, it's fun to spend that time, but don't make the mistake of thinking it's productive. It's just another form of procrastination (or, to put it more charitably, a hobby).
I agree that compiling a full Linux distro as a matter of practice is a waste of time. But doing it a few times is good if you want to understand your tools.
I don’t believe that top tier engineers just skip learning things because they might turn out to be dead-ends or incorporated into tools by someone else; in my experience they tend to be extremely interested in things that seem like minutiae to others when working on the bleeding edge, often implementing their own systems just to more fully understand the problem space.
If it’s a day job for someone and they are not ambitious, fine. But this is Hacker News. I would bet 99%+ of top-tier software talent could tell you about practical experience with Ralph loops this year, or a homegrown variety, simply because they are an attempt to solve a very real engineering problem (early exit, shitty code/incorrect responses, poor context window length and capacity). Top-tier software people expect more control over their engineering environment, and more success with their tools, than they’d get by just saying ‘meh, whatever, I don’t get this and I’ll just wait it out.’
The players barely ever change. People don't have problems following sports; you shouldn't struggle so much with this once you accept that the top spot changes.
I didn't express this well, but my interest isn't "who is in the top spot"; it's more the why and the how of various labs getting the results they do. This is magnified by the fact that I'm interested not only in hosted inference providers but in local models as well. What's your take on the best model to run for coding on 24GB of VRAM locally after the last few weeks of releases? Which harness do you prefer? What quants do you think are best? To use your sports metaphor, it's more than following the national leagues; it's also following college and even high school leagues. And the real interest isn't even who's doing well but WHY, at each level.
It is funny seeing people ping pong between Anthropic and ChatGPT, with similar rhetoric in both directions.
At this point I would just pick the one whose "ethics" and user experience you prefer. The difference in performance between these releases has had no impact on the meaningful work one can do with them, unless perhaps they are on the fringes of some domain.
Personally I am trying out the open models cloud hosted, since I am not interested in being rug pulled by the big two providers. They have come a long way, and for all the work I actually trust to an LLM they seem to be sufficient.
The financial projections that much of their valuation and investor story is built on involve actually making money, and lots of it, at some point. That money has to come from somewhere.
I’m very satisfied with being three months behind everything in AI. That’s a level that’s useful: the overhyped nonsense gets found out before I need to care, and it’s easy enough to keep up with.
It honestly has all kinda felt like more of the same ever since maybe GPT-4?
New model comes out, has some nice benchmarks, but the subjective experience of actually using it stays the same. Nothing's really blown my mind since.
Feels like the field has stagnated to a point where only the enthusiasts care.
For coding, Opus 4.5 in Q3 2025 was still the best model I've used.
Since then it's just been a cycle of the old model being progressively lobotomised and a "new" one coming out that if you're lucky might be as good as the OG Opus 4.5 for a couple of weeks.
Subjective but as far as I can tell no progress in almost a year, which is a lifetime in 2022-25 LLM timelines
Another annoyance (more on the API side) is summarized/hidden reasoning traces. It makes prompt debugging and optimization much harder, since you have very little visibility into the real thinking process.
I don't trust the benchmarks either, so I maintain a set of benchmarks myself. I'm mostly interested in local models, and over the past 2 years they have steadily gotten better.
Can't argue with subjective experience, but if there were some tasks that you thought LLMs couldn't do two years ago, maybe try again today. You might be surprised.
I'd wager that being conscripted in Norway carries a different level of risk of deployment than being conscripted in the US, given that we've essentially been nonstop involved in wars for my entire lifetime.
When you were conscripted, did you fear you might be sent to Iraq or Afghanistan? It just feels like, given our history, an American conscript will literally always have some active warzone to possibly be sent off to. Our countries and our armies are not the same. Is Norway today chomping at the bit to send its soldiers to Iran? Or, per Trump, "our next conquest" Cuba? I really don't think you can think of being drafted into the American army the same way you think of the compulsory service of countries like South Korea or your own.
Being conscripted in a defensive army is materially different than being conscripted into one that takes every opportunity to engage in conflicts across the globe.
I did my service right around the time the GWOT started, and it was around then that our military began focusing on transitioning to a professional military (we do have professional units) aimed at fighting terrorism in the Middle East (Afghanistan/ISAF) as part of our NATO duties.
By the time you were finishing up your service (6-12 months depending on where you were stationed), you'd get a presentation on "the road ahead" if you wanted to continue military life: military school/college, become a professional soldier, etc.
With that said, I think maybe 10-15% of the guys in our platoon decided to go the Afghanistan route. IIRC that meant transferring to / trying out for the professional battalion (TMBN), training for some time, and then deploying.
I don't think sending all conscripted soldiers to some foreign war will yield good results. But I do think that by the end of their service, some will be hyped up and "thirsty" enough to just go for it.
Genuinely sorry he let you down and you're left holding the bag dude. But please understand people aren't going to accept your weak rationalizations anymore.
Why is anyone still using or even talking about Gas Town? Now that HN is largely onboard with agentic development and has at least tried it themselves who's still under the impression that it's useful?
The value you get out of a simpler adversarial loop to critique your "main" agent's work is high. Stacking Steve Yegge's personal Kingdom of Nouns on top of each other doesn't add much more.
And this doesn't even begin to get into the madness that is verification for software that matters and is exposed through multiple modalities. You cannot let an agent just vibe its way around "does this business-critical thing with these specific use cases do its job correctly", much as Yegge might have you believe.
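To make "simpler adversarial loop" concrete, here's a minimal sketch in Python. The call_llm helper is a hypothetical stand-in for whatever model call you happen to use (API, CLI, or local); this isn't any vendor's actual API, just the shape of the idea:

    def call_llm(prompt: str) -> str:
        """Hypothetical stand-in for your model call (API, CLI, or local)."""
        raise NotImplementedError

    def adversarial_loop(task: str, max_rounds: int = 3) -> str:
        # One agent drafts a solution...
        draft = call_llm(f"Complete this task:\n{task}")
        for _ in range(max_rounds):
            # ...and a second, adversarial prompt tries to tear it apart.
            critique = call_llm(
                "You are a hostile reviewer. List concrete defects in this "
                f"solution.\nTask: {task}\nSolution:\n{draft}\n"
                "Reply with only APPROVED if you find none."
            )
            if critique.strip() == "APPROVED":
                break
            # Feed the critique back to the generator and revise.
            draft = call_llm(
                f"Revise the solution to address the critique.\n"
                f"Task: {task}\nSolution:\n{draft}\nCritique:\n{critique}"
            )
        return draft

One generator, one critic, a fixed round budget. Most of the value is already in that shape; stacking more roles on top doesn't add much.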
I was about to post this same question, but saw yours, and somehow that switched me from "wtf?" to "I have an answer": there's just that much interest in anything AI right now.
To wit, I still can't believe OpenClaw blew up, and it's much less... opinionated than whatever is going on here. (deacons?)
Non-SWE TradMom™ posted on X™ yesterday about her OpenClaw that is set up with all her accounts so every morning she can get a family summary. She added a chunk of instructions amounting to "PLEASE don't do anything insecure!", and the OpenClaw founder retweeted approvingly.
I left Google 3 years ago to build something. I'm very fond of the OpenClaw founder. And yet I absolutely cannot believe that he let such an obvious UX and security mess out into the world. We grew up in the same incubator (~2008 iPhone OS Twitter) and presumably share the same values, yet came to polar opposite conclusions.
Why do I view it as such a necessity to have a GUI, multiplatform support, and built-in Willison Trifecta stuff that I'm still pounding away 2.5 years in and won't release, when clearly you don't need that stuff?
I think in a steady state, product and UX discipline will win out. I bet within 3 months Gastown is a ghost town with maybe some non-technical crypto fans. In a year, OpenClaw is probably around, but not nearly the mindshare. It'll be quietly de-invested via OpenAI carefully managing the OpenClaw founder into working on their Everything App. (This is already happening: he got a nice PR interview with an OpenAI lead previewing the Everything App.)
Another anecdote re: demand:
My completely non-technical nurse ex-girlfriend from high school called me two weeks ago, for the first time in years. The lede was that I was right about AI, and the substance was: via Claude Code, she built her own Ollama-based Mac Mini server that she can connect to remotely via an Expo app.
Does it work? Astoundingly, yes.
She also has no idea what is going on. She swears up and down that her AIs on Claude.ai, ChatGPT.com and Ollama are somehow talking to each other, and she does not mean APIs. She tried answering a question I had about a graph visualization of her chats by talking to ChatGPT.com about it, even though Claude Code had written it, and I just didn't bother saying anything.
Anthropic recently killed the ability for third parties to use the Claude Code subscription, and it's assumed they're subsidising that price heavily. Which is fine, but it's a good reminder of the vendor lock-in risk. One policy change and your workflow breaks. Twill is agent-agnostic (Claude Code, Codex CLI, OpenCode), so you're not betting on any single vendor's pricing decisions.
On the cost for solo devs, yeah, if you're one person running one agent at a time on your laptop, the sub is probably the better deal today. No argument there. The cloud agent model starts to make sense when you want to fire off multiple tasks in parallel.
Yes, the difference is that Twill launches dedicated infra in a sandbox for each task. This means you can work on multiple tasks requiring a DB migration, for instance.
Also, you can fire-and-forget tasks (my favorite) and don't have to keep your laptop running at night.
See also Cowork and other upcoming Anthropic features.
See also Show HN: this exact kind of product is frequently posted as a GitHub link.
The paradigm shift in AI means what you are making is (1) filling a gap until the primary players implement it (most have it in their pipeline if not shipped already), and (2) easy to replicate with said AI using my preferred tech stack.
Cowork does not seem to be focused on engineering, but we are fully expecting Anthropic to catch up in this category.
What Anthropic can't offer is letting you use Codex or combine it with Claude Code. That is why we think players other than the AI labs have a say in this market.
To your last point: as always, there is a buy-vs-build tradeoff, which ultimately comes down to focusing on your core business, and we think that still remains important in the AI era.
My comment about Cowork is more about pointing out a different feature set that will cross over with Code. For example, they have the Task-related things as an affordance; Code has this coming.
I believe there is a difference between an open-source framework and a product. You would still have to manage and scale your infra, build the integration layer around it to make it accessible where your teams are, fix bugs, etc.
I am not saying that build is always the bad choice, but the tradeoff did not disappear imo
I'm surprised how much you push back instead of digging in to understand more. I have heard mentor time is way down at YC since they stopped doing things that don't scale. You could be asking questions to better understand where you'd fit in with users and how to better position yourself. We are your market; how do we see the world now, post-AI?
I’m newer to knowing and caring about what YC does at all in terms of the companies it funds. The fact that this is YC makes me think the org has forfeited any sense of “taste” at all. Complete scattershot from people who have money to scatter I guess.
You can read old Paul Graham essays and the early YC Startup School material (which is probably when peak YC happened) to get a sense of the ethos. They increased batch size to scale (hence the "stopped doing things that don't scale" comment).