
"Even without AI-generated code, [code review] is already a major failing."

By all means, yes. Yet it feels like we were playing a catch-up game (to a degree), and no one intentionally shipped unreviewed code. Now, review without comprehension is becoming the standard, and unreviewed & unread code increasingly ships. That's a different kind of reckless.

"Open source adds thousands of (unpaid) eyes to code review."

True. And the open source community is seeing a massive inflow of AI-generated pull requests, which overwhelms its capacity to review. Leaving the ecosystem as it is means it will die. Thus, I expect resistance or evolution. And we definitely see some of the former, with some open source codebases being closed to AI contributions.

"Hiring is now 'My AI versus Your AI; and the former real need of 'A qualified person for a suitable job' is lost in the fallout."

Yes, that's where hiring has headed. Which, coincidentally, has made everyone worse off (save for the providers of AI-for-hiring apps). Candidates find it harder to land a decent job. Companies talk to the people who play the AI hiring game best, not the most suitable candidates. All with the same number of candidates and the same number of jobs, but 100x as many resumes exchanged: https://brodzinski.com/2025/08/broken-ai-hiring.html

Which basically means that a resume has lost its value as a token of information exchange. And since we base the whole process on this very assumption (the resume as a token of information), the system is due to be rewired eventually, and sooner rather than later. One random idea: how about limiting the traffic to people who care at least enough to pay some token money: https://brodzinski.com/2025/12/pay-for-resume-read.html

"My humble suggestion is that our ultimate question be phrased as 'How much is enough?'"

Perfect question if we start from the grand scheme of things. I am afraid, though, that there is never enough. At some point, another billion means increased status. You could buy everything with the billions you had previously, so right now it's a virtual leaderboard between you and other billionaires. And the status game is, indeed, infinite. If you aren't winning now, you can chase the leader. If you are the leader, you try to escape the chase.

The "enough" question doesn't work just as well in a finer-grained context. If we want to figure out things like the evolution of a specific profession. Or consider how digital products will be built in the future. Or how well outsourcing your content generation to an AI agent would work in the long run.


Security is an even bigger issue than it looks at first glance. While security risk by omission was always a thing (AI or not), we now face a whole new level of risks, from prompt injection to malicious libraries crafted to be picked up by coding agents: https://garymarcus.substack.com/p/llms-coding-agents-securit...

The most shallow layer of security, however, has become easier. You can now run an automated AI security audit every day for (basically) free, without hiring specialists to run pen tests.

Which makes the whole thing even more challenging. Safe on the surface while vulnerable in the details creates a false sense of safety.

Yet all of this becomes a concern only once a product is at all successful. Once it is, hypothetically, the company behind it should have the money to fix the vulnerabilities (I know, "hypothetically"). The maintenance cost hits way earlier than that. It kicks in even for a personal pet project that's isolated from the broader internet. So I treat it as an early filter, one that will reduce the enthusiasm of wannabe founders.


The automated audit only covers static analysis. When the agent actually runs, hitting MCP servers, making HTTP calls, getting responses back, that's where the real problems show up. Prompt injection through tool responses, malicious libraries that exfiltrate env vars, SSRF from agents that blindly follow redirects. Code audits miss all of it because this is a runtime and network problem, not a code quality problem.
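To make that concrete, here's a rough, hypothetical sketch (not Pipelock's or any particular tool's implementation) of the kind of check that only exists at runtime: following an agent's HTTP redirects hop by hop and refusing any hop that resolves to a private or loopback address, i.e. the classic SSRF path.

    import ipaddress
    import socket
    from urllib.parse import urlparse

    import requests

    def resolves_to_private(url):
        # Resolve the host and check for private/loopback/link-local ranges.
        host = urlparse(url).hostname or ""
        try:
            addr = ipaddress.ip_address(socket.gethostbyname(host))
        except (socket.gaierror, ValueError):
            return True  # fail closed on anything we can't resolve
        return addr.is_private or addr.is_loopback or addr.is_link_local

    def guarded_get(url, max_redirects=5):
        # Follow redirects manually so every hop gets the same check,
        # instead of letting the agent blindly chase a Location header.
        for _ in range(max_redirects + 1):
            if resolves_to_private(url):
                raise PermissionError("blocked egress to private address: " + url)
            resp = requests.get(url, allow_redirects=False, timeout=10)
            if resp.is_redirect or resp.is_permanent_redirect:
                url = resp.headers["Location"]
                continue
            return resp
        raise RuntimeError("too many redirects")

The check runs at request time, against the resolved address of every hop, which is exactly the information a static review of the agent's code never sees.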

Built Pipelock for this actually. It's a network proxy that sits between the agent and everything it talks to. Still early but the gap is real. https://github.com/luckyPipewrench/pipelock


Yes. And the more autonomously we create code, the more of these (and other) vulnerabilities we'll be adding. Combine that with AI automation in attacks, and you have an all-out security mess.

It's like a Petri dish for inventing new angles of security attacks.

Oh, and let's not forget that coding agents are non-deterministic. The same prompt will yield a different result each time, especially for more complex tasks. So it's probably enough to wait till a vibe-coded product "slips." Ultimately, as a black hat hacker, I don't need all products to be vulnerable. I can work with the few that are.


Agreed. The non-determinism makes traditional testing basically useless here. You can't write a test suite for "the agent decided to do something unexpected this time." Logging and runtime checks are the only way to catch the weird edge cases.
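For what it's worth, a minimal sketch of what I mean by runtime checks, assuming the agent's tools are plain Python callables (the names here are made up): wrap every tool call so its arguments and results are logged and screened, instead of trying to predict the agent's behavior in a test suite.

    import functools
    import json
    import logging
    import re

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("agent-audit")

    # Crude heuristic for secrets leaking into outbound tool calls.
    SECRET_PATTERN = re.compile(r"api[_-]?key|secret|token|authorization", re.IGNORECASE)

    def audited(tool):
        # Wrap a tool so every invocation is logged and flagged at runtime.
        @functools.wraps(tool)
        def wrapper(*args, **kwargs):
            payload = json.dumps({"args": args, "kwargs": kwargs}, default=str)
            if SECRET_PATTERN.search(payload):
                log.warning("possible secret in call to %s", tool.__name__)
            log.info("tool=%s payload=%s", tool.__name__, payload)
            result = tool(*args, **kwargs)
            log.info("tool=%s returned: %.200s", tool.__name__, str(result))
            return result
        return wrapper

    @audited
    def fetch_url(url):
        # Stand-in for whatever the agent's real tool does.
        return "..."

It won't catch everything, but at least the weird edge cases leave a trail you can inspect after the fact.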


The question is not whether we like or want subscriptions, but rather whether we're used to them. And the answer is yes.

Given the choice, we'd be using Spotifys and Netflixes for free, and have ad-free Google. I don't expect that choice to be given to us.

AI tools won't change anything on that account. At best, we'll swap one subscription for another, except that the latter will add a bill for the tokens we use.


There's a huge difference between nurses or teachers and Ivy League students. Namely, the former are not remotely as prestigious positions. I highly doubt there are 20 candidates for each nursing or teaching job.

Affirmative action happens when we discuss privileged positions. Spots at Ivy League colleges definitely are positions of privilege.

So if the situation under consideration were nursing, there wouldn't be such a discussion because there wouldn't be affirmative action in place.


> do Altman and Andreesen really believe that, or is it just a marketing and investment pitch?

As for Andreessen, I don't think he even cares. As the author writes:

"for the venture capitalists that have driven so much of field, scaling, even if it fails, has been a great run: it’s been a way to take their 2% management fee investing someone else’s money on plausible-ish sounding bets that were truly massive, which makes them rich no matter how things turn out"

VCs win every time. Even if it's a bubble and it bursts, they still win. In fact, they are the only party that wins.

Heck, the bigger the bubble, the more money is poured into it, and the bigger the commissions. So VCs have an interest in pumping it up.


> Have LLMs learned to say "I don't know" yet?

Can they, fundamentally, do that? That is, given the current technology.

Architecturally, they don't have a concept of "not knowing." They can say "I don't know," but that simply means it was the most likely answer given the training data.

A perfect example: an LLM citing chess rules and still making an illegal move: https://garymarcus.substack.com/p/generative-ais-crippling-a...

Heck, it can even say the move would have been illegal. And it would still make it.


My original point was that if the current technology does not allow them to sincerely say "I don't know, I am now checking it out," then they are not AGI.

I am aware that the LLM companies are starting to integrate this quality -- and I strongly approve. But again, being self-critical, and as such having some self-awareness, is one of the qualities that I would ascribe to an AGI.


> We've got something that seems to be general and seems to be more intelligent than an average human.

We've got something that occasionally sounds as if it were more intelligent than an average human. However, if we stick to areas of interest of that average human, they'll beat the machine in reasoning, critical assessment, etc.

And in just about any area, an average human will beat the machine wherever a world model is required, i.e., a generalized understanding of how the world works.

It's not to criticize the usefulness of LLMs. Yet broad statements that an LLM is more intelligent than an average Joe are necessarily misleading.

I like how Simon Wardley assesses how good the most recent models are: he asks them to summarize an article or a book he's deeply familiar with (his own or someone else's). It's a test of trust. If he can't trust the summary of material he knows, he can't trust the summary of material that's foreign to him either.


What's the lifecycle length of GPUs? 2-4 years? By the time the OpenAIs and Anthropics pivot, many GPUs will be beyond their half-life. I doubt there would be many takers for that infrastructure.

Especially given the humongous scale of infrastructure that the current approach requires. Is there another line of technology that would require remotely as much?

Note, I'm not saying there can't be. It's just that I don't think there are obvious shots at that target.


> I stopped reading here, which is at the very start of the article (...)

> (...) this article is low quality and honestly full of basic errors.

Just curious: How do you know it's full of errors, given that you stopped reading at the very start?


One more interesting aspect: the infrastructure doesn't age that well. We basically need to renew all that infrastructure every, like, 2-4 years or so? (And I think I'm being optimistic here.)

