hellohello2 · 2026-06-01T01:40:28 1780278028

You had sudo on your PC. You just didn't know ;)

hellohello2 · 2026-05-29T15:59:41 1780070381

I'm curious, have you tried working seriously with claude code or gpt codex and which part of it did you not enjoy? What makes you wish to write code like 2022?

toast0 · 2026-05-29T16:12:33 1780071153

Having watched people use these kinds of tools, it feels like trying to tell an intern to do a project.

Except with an intern, hopefully there's personal development and you only have to be very specific a few times. And the intern's manager gets good feels for helping someone grow, and maybe it's a hiring pipeline.

If I'm going to have to do that for everything, I would rather just do the work myself.

I have seen some sessions with, let's call it, over-aggressive autocomplete... That's mildly tempting, but I'm happy with my disintegrated development environment, and it doesn't have any way to do autocomplete at all, so that's not happening for me either.

hellohello2 · 2026-05-29T16:19:51 1780071591

Current SOTA is far past "aggressive autocomplete" at this point; it's more like you ask for a PR for a small feature and it's done... I guess for me the fun is that you can build a lot yourself, without relying on others. I hear you on the social aspect though & thanks for sharing your pov.

coldtea · 2026-05-29T16:01:59 1780070519

If you just like talking and some program comes out (aka "business problem solving") you might like it.

If you like coding (aka "problem solving"), it feels like crap.

And if you like still having an IT job in a couple of years, it feels like dangerous crap.

(Of course you can be hoping you'll be the one selected, out of millions laid off, to get to keep working on a higher level).

hellohello2 · 2026-05-29T16:16:20 1780071380

Perhaps it's the apprehension/anxiety that makes it feel bad then? I like coding (building things) and couldn't care less about businesses, and am having a great time. In the current state of AI, mass layoffs probably won't happen. But I guess it's a bit scary that we don't know how much more it will improve...

coldtea · 2026-05-29T21:29:37 1780090177

>Perhaps it's the apprehension/anxiety that makes it feel bad then?

It's a big part. But the erosion of what coding means is another big part for some.

>I like coding (building things) and couldn't care less about businesses

There are people who like coding but mean "building things" by it, and others who like coding and mean "coding" itself. Those of us in the latter group aren't as pleased.

(I also like building things, but I like building them via coding, not thru vibing and getting them spit out).

jon-wood · 2026-05-29T17:09:28 1780074568

> the current state of AI, mass layoffs probably won't happen.

I’m sorry, what? Have you been paying any attention at all to the state of the industry lately?

hellohello2 · 2026-05-29T18:16:38 1780078598

I wasn't clear enough. I was replying to "you'll be the one selected, out of millions laid off"; in context I meant "mass layoffs" as in "95% of everyone is out of a job permanently".

tonyedgecombe · 2026-05-30T08:12:07 1780128727

It will be somewhat ironic if the people losing out in this transition are the ones screaming about its benefits.

alchemism · 2026-05-29T16:20:04 1780071604

I’d probably let go of the employees who decline using agentic tools first, tbh. All things being equal.

claytongulick · 2026-05-29T17:53:06 1780077186

And that's the problem.

The companies that agree with you will be at an interesting place when they have piles of AI slop code and no talented developers.

alchemism · 2026-05-29T20:42:19 1780087339

I can’t say Claude Code is not a great product with solid market fit even though the code inside it is aesthetically garbage.

azangru · 2026-05-29T17:12:28 1780074748

> have you tried working seriously with claude code or gpt codex and which part of it did you not enjoy?

I haven't. But I found myself, to my surprise, not particularly interested in trying; which makes me wonder what motivates other developers if not peer pressure or demands for more productivity. I find coding interesting and fulfilling enough to do it on my own. I do ask LLMs questions from time to time, but for that, even a chatgpt or a gemini in a browser tab is enough.

The best experience I had so far is with code reviews, when the models pointed out my mistakes. But I haven't yet gotten to the point where I would want them to write code for me.

hellohello2 · 2026-05-30T23:24:11 1780183451

Just to share my perspective, I have not had this much fun programming since I first learned to code. It's really something you have to try for yourself to actually understand. It's like a new form of programming, where code is "soft" instead of "hard"; on the whole it feels similar, but also completely new.

The opinions on this site make me realize most people here are into programming for the money, rather than for the fun of building things. Which is completely fine, but it leads to most commenters being depressed rather than enthralled, which honestly feels confusing at times. Obviously things are looking pretty bleak socially, but if you find coding fulfilling on its own, let's just say you can look forward to fulfillment lol

hellohello2 · 2026-05-28T18:02:24 1779991344

"It is almost guaranteed that a 60-90B model can outperform current SOTA in coding tasks within 2-3 years"

What insight do you have to make this claim?

roadside_picnic · 2026-05-28T18:14:50 1779992090

Have you personally used any of the latest batch of even smaller local models? They certainly don't beat SotA models at coding... but with a good harness they are able to achieve things that I couldn't achieve with SotA last year.

I've repeatedly given local models non-trivial projects that involve research and coding which they've successfully completed with minimal intervention from me (almost exclusively in the domain of reviewing the results). Again, nothing comparable with current SotA, but definitely tasks I could not have given SotA models last year (without agent harness).

Now that pure progress from these models seems to have slowed down, we're seeing a ton of options for both making models more efficient and other tools that help improve them (everything from agent harnesses to RLVR).

That's just looking at "what can small models do today". When you look at what's possible with larger open models that are still much smaller than SotA from the major providers, their performance is extremely close to SotA, close enough that for personal projects I'll just use Kimi instead of any Anthropic offerings.

So it's not terribly hard to imagine a solution in the middle happening within a few years. We still have tons to learn about optimal sizes of these models and how to build them with maximal efficiency (and we've already seen a lot of recent improvements in this space).

maccard · 2026-05-28T18:30:11 1779993011

> but with a good harness they are able to achieve things that I couldn't achieve with SotA last year.

What happens if you run last year's model in a SOTA harness? IME, the quality of the harness has a much more significant impact on the quality of the result, once you get past the initial hump of “can it do anything at all”.

windexh8er · 2026-05-28T19:32:51 1779996771

I think this is a big component, but also context. A large factor in any model being able to handle complexity comes down to context length.

I think multiple SLMs driven by an orchestration framework (harness or otherwise) will ultimately displace LLMs. Right now we're in the era of diminishing returns with respect to LLM gains. Moving the needle by a few percent doesn't excite as many people anymore, and with "reasoning" capabilities there's no reason why small distributed models can't be run more efficiently, especially if/when we start to see gains in modularized context management solutions.

coderenegade · 2026-05-30T08:13:40 1780128820

It's hard to know for sure. There are good information-theoretic reasons to suspect that general models will always be better than smaller expert models, but maybe a MoE can claw some performance back, albeit with redundant computation. The properties of conditional entropy, for instance, always favor more generality. This assumes that the harness isn't a factor, or is at least equivalent across different models.

mswphd · 2026-05-28T20:11:28 1779999088

sure, but high-quality harnesses require less gpu compute/VRAM, and plausibly can be used locally by most users.

hellohello2 · 2026-05-28T22:59:22 1780009162

"Have you personally used any of the latest batch of even smaller local models?"

No I have not, which is why I asked (it wasn't a rhetorical question). Do you have pointers on what the recent improvements are?

blurbleblurble · 2026-05-29T01:00:32 1780016432

Try qwen 3.6 models with hermes and see for yourself. 27b is excellent and 35b is very good for basic agentic tasks.

sixothree · 2026-05-28T18:34:15 1779993255

Can you spare a sentence or two describing your local setup?

theplatman · 2026-05-28T19:40:44 1779997244

biggest thing i wish was present in more discussions about models is people providing more specifics on their setups vs. vague descriptions of harnesses

trees101 · 2026-05-28T21:44:13 1780004653

can you please share details about your harness

onlyrealcuzzo · 2026-05-28T18:10:32 1779991832

1. Context is all you need... They are heavily investing in getting better context (especially for coding tasks). This will disproportionately advantage smaller models (and benefit everyone).

A smaller model with better context today can outperform a model with 100x more parameters with bad or diluted context.

2. MoE (already abundant) + MLA (mostly memory efficiency, not quality) + Medusa (speed, not quality) + GRAM (5000-10,000x better reasoning in an extremely small model) + 1.58b (unclear if it will have the impact Microsoft first claimed - but possibly 5x).

knollimar · 2026-05-28T18:47:55 1779994075

Probably just "gemma was cool"

hellohello2 · 2026-05-28T17:48:04 1779990484

No, chatbots are LLMs trained for question-answering through RLHF (it's not just a prompt). But yes, if you just zero-shot prompt a bare LLM you can still "talk to it" & you are correct on everything else as far as I know.

hellohello2 · 2026-05-28T17:45:57 1779990357

They are chatbots trained for tool use, it's not just a prompt.

hellohello2 · 2026-05-22T16:48:37 1779468517

Computation halves in price every ~2 years so maybe in the short term but not in the long term
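
To make the arithmetic behind that rule of thumb concrete, here is a minimal sketch. It assumes a steady halving every 2 years, which is the comment's rough figure and not a measured one; `relative_cost` and its parameters are hypothetical names used only for illustration.

```python
def relative_cost(years: float, halving_period_years: float = 2.0) -> float:
    """Fraction of today's price left after `years`, assuming steady halving.

    The 2-year halving period is an assumption taken from the comment above.
    """
    return 0.5 ** (years / halving_period_years)


# Show how the assumed trend compounds over a few horizons.
for years in (2, 4, 10):
    print(f"after {years:>2} years: {relative_cost(years):.5f} of today's price")
```

Under that assumption the same amount of computation would cost about 3% of today's price after ten years, which is why a short-term price spike and a long-term decline can both be true.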

dawnerd · 2026-05-22T17:16:29 1779470189

How is that possible when the cost of memory and hard drives has gone up 3x+ in the last six months? Maybe cheaper if you're OAI or one of the lucky companies Nvidia is propping up. Everyone else is getting screwed.

hellohello2 · 2026-05-24T02:48:21 1779590901

Sorry, but I don't really see how this contradicts what I said; both our statements are compatible in the context of what I was replying to.

ungreased0675 · 2026-05-23T13:01:04 1779541264

Even so, frontier models get bigger and more complicated, and agentic workflows consume exponentially more tokens than the simple chat windows of two years ago.

hellohello2 · 2026-05-24T02:47:04 1779590824

I agree that the amount people pay for these services is very unlikely to decrease (i.e. Blinn's Law but for tokens). Still, the current level of "intelligence" will eventually become available for a very low price almost surely. Really I simply don't see how you can disagree with the parent's comment "There is no world in which AI is not used extensively in all employment going forward." Honestly I'd like to understand the mindset, is it mainly that you dislike working with these tools/hope they don't get imposed on you or did you actually find them harmful in your work in some way and think they are overvalued?

hellohello2 · 2026-05-19T15:49:59 1779205799

There can be some issues with shadowing, yeah, especially if you render with splatting/rasterization, but it's fine if you raytrace, I think.

hellohello2 · 2026-05-19T15:45:27 1779205527

There are some works on doing this directly e.g. https://arxiv.org/abs/2601.23065 but getting accurate materials is a challenge for anything more than diffuse.

AI-based relighting will no doubt start working soon.

hellohello2 · 2026-05-17T01:28:09 1778981289

"Text generated by an LM is not grounded in communicative intent, any model of the world, or any model of the reader’s state of mind."

Modelling text describing the world is not modelling (some aspect) of the world?

Modelling the probability that a reader likes or dislikes a piece of text is not modelling (some aspect) of a reader's state of mind?

qsera · 2026-05-17T04:20:20 1778991620

>Modelling text describing the world is not modelling (some aspect) of the world?

The text describes the world to humans. This is the crucial thing that you miss. It is very subjective.

Imagine that you learn the grammar of a foreign language without learning the meaning of the words. You might be able to make grammatically valid sentences. But you still will not understand a single thing that something written in that language describes. But that will be perfectly clear to someone who actually understands the meaning of the words.

When you train LLMs on large volumes of text that describe logically consistent facts in a million different ways, the "logic" sort of becomes part of the grammar that the model learns. That is, logic becomes a higher kind of "grammar" or an enormous set of grammatical rules that it captures. But that does not mean the model can do actual logic.

NooneAtAll3 · 2026-05-17T12:07:42 1779019662

> Imagine that you learn the grammar of a foreign language without learning the meaning of the words. You might be able to make grammatically valid sentences. But you still will not understand a single thing that something written in that language describes. But that will be perfectly clear to someone who actually understands the meaning of the words.

so... back to chinese room arguments?

just because the Amazon worker inside is just moving folders around following rules doesn't by default mean the room as a whole must be "something that doesn't understand"

denying emergence as a phenomenon isn't useful when "there are plenty of higher abstraction levels in multiple fields that still capture 99% of events and are easier to model and react to" is the counterpoint

libraryofbabel · 2026-05-17T06:49:35 1779000575

> When you train LLMs on large volumes of text that describe logically consistent facts in a million different ways, the "logic" sort of becomes part of the grammar that the model learns. That is, logic becomes a higher kind of "grammar" or an enormous set of grammatical rules that it captures. But that does not mean the model can do actual logic.

This is the kind of stuff people were saying in 2023. But it’s 2026 now and LLMs aren’t just trained by reading lots of text anymore. That’s “pretraining”, and it’s still the first stage, but LLMs also have a huge amount of RLVR training where they actually do solve huge numbers of mathematical and logic puzzles and update their weights in response. They don’t just learn mathematics from reading about it now. They learn it by doing it. That is why they can now solve hard problems and prove theorems.

> that does not mean the model can do actual logic.

But they do, all the time. (Please tell me you’ve at least put a frontier LLM through its paces in the last 6 months?) If you think they can’t do logic and reasoning, can you provide examples of specific math or logic problems that you think a frontier LLM can’t do?

qsera · 2026-05-17T07:01:05 1779001265

>If you think they can’t do logic and reasoning, can you provide examples of specific math or logic problems that you think a frontier LLM can’t do?

When a thing can "solve" a complex math problem without having the ability to count, then it is clear that this thing is not "reasoning" or doing "logic".

libraryofbabel · 2026-05-17T18:47:12 1779043632

You didn’t answer my question. You just restated your claims.

Specific examples? Specific tasks?

hellohello2 · 2026-05-17T05:40:35 1778996435

Thanks for your explanation, I find it much more intuitive than the paper's.

In your opinion, does a Calculus solver model certain aspects of the world?

tootie · 2026-05-17T01:56:07 1778982967

No? There's no model involved. It's all just probabilistic. LLMs understand what you're thinking as well as a mood ring.

roenxi · 2026-05-17T02:13:50 1778984030

It isn't possible to have "just probabilistic" (maybe a philosophical exception could be made for a uniform random distribution or whatever provides the little dose of randomness required to get nondeterministic results). Probabilities are always in context of a model. LLMs model language but language itself is a model of something else. My money would have been on language modelling nonsense, but that is quite clearly not the case. Turns out it models the world and so do LLMs.

hellohello2 · 2026-05-17T02:03:23 1778983403

The literal definition of a model is "an informative representation of an object, person, or system". I think you mean something else though, what are you trying to express exactly?

aoeusnth1 · 2026-05-17T02:32:37 1778985157

The model is the thing which is learned in order to make the probabilistic prediction with low entropy.

tootie · 2026-05-17T18:25:35 1779042335

Well this is probably the same kind of semantic trap she's fighting with. Yes, you're right, it's a model. The distinction is that they are models of _language_ and not thoughts or feelings.

aoeusnth1 · 2026-05-18T07:07:48 1779088068

When I read your reply, I’m also modeling language. Tokens are just the discretization of the model’s eyes and ears. My brain does a huge amount of work to represent what’s happening in the world based on discrete information received from the outside world, just like language models do.

tootie · 2026-05-18T13:08:29 1779109709

Sure but you've also probably formed a model of who I am and what I'm thinking and formulated a response that isn't just grammatical and relevant but designed to provoke an outcome.

aoeusnth1 · 2026-05-18T20:08:01 1779134881

We're discussing whether they are models or not, not whether they have goals and agency. A language model does form a model of who you are and what you're thinking, because language is causally connected to those aspects of the generating distribution and modeling those aspects reduces cross-entropy.

RL provides the goals and agency. Pretraining provides the model.
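
A toy illustration of the cross-entropy point, as a sketch under loud assumptions: the two writer states, the two tokens, and every probability below are made up, and the predictor is simply handed the writer's state, whereas a real language model would have to infer it from the preceding text.

```python
import math

# Hypothetical generating distribution: p(next_token | writer_state).
P_TOKEN_GIVEN_STATE = {
    "happy": {"great": 0.9, "awful": 0.1},
    "upset": {"great": 0.1, "awful": 0.9},
}
P_STATE = {"happy": 0.5, "upset": 0.5}  # assumed prior over writer states


def cross_entropy_bits(predict) -> float:
    """Expected -log2 q(token) when tokens come from the true distribution."""
    total = 0.0
    for state, p_state in P_STATE.items():
        for token, p_token in P_TOKEN_GIVEN_STATE[state].items():
            total += p_state * p_token * -math.log2(predict(state)[token])
    return total


def ignore_state(state: str) -> dict:
    # Best possible prediction without modeling the writer: the marginal.
    return {"great": 0.5, "awful": 0.5}


def track_state(state: str) -> dict:
    # Prediction that models the writer's state.
    return P_TOKEN_GIVEN_STATE[state]


blind = cross_entropy_bits(ignore_state)
aware = cross_entropy_bits(track_state)
print(f"ignoring the writer: {blind:.3f} bits/token")
print(f"modeling the writer: {aware:.3f} bits/token")
```

In this toy setup the predictor that tracks the writer's state pays fewer bits per token than the one that ignores it, which is the sense in which a next-token objective rewards modeling whatever generated the text.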

afthonos · 2026-05-17T02:08:04 1778983684

Nothing about an LLM is “just”. In what precise sense do you mean it is probabilistic?

majormajor · 2026-05-17T04:12:41 1778991161

There's a reason stochastic was used in the original phrase instead of "probabilistic."

While most inference executions are intentionally non-deterministic, even a purely deterministic one would still be stochastic in that the model itself was built in a process such that the statistical frequency, sequencing, etc of the training text and followup processes all heavily influence the result.

Because of that, the output is the sort of thing that is not expected to generate 100% perfect output 100% of the time, but to have a good probability of being like-in-kind-to-the-training-data (and useful/relevant as a result).

(As compared to a non-stochastic model, like arithmetic on integers, where 2+2 is always gonna be 4 and you don't have a chance of coming up with some novel pair of inputs to addition that will cause your arithmetic to miss the mark.)

afthonos · 2026-05-18T17:02:46 1779123766

Agreed. My point was to question the use of “just“ to obscure an incredibly complicated process, which has been shown repeatedly to rely on generalizations that are indistinguishable from world models.

Now, it is true that the world they’re modeling is the world of tokens. But insofar as those tokens, be they text or images or videos, are themselves modeling the real world, LLMs do have a model of the real world.

hellohello2 · 2026-05-11T17:42:27 1778521347

I think the parent is mostly referring to solutions like Slang.D