robrenaud's comments

The flat earthers are why I hate astronomy.

Afaict, the grandparent poster is just very wrong. You do want to cause acute stresses to your heart (cardiovascular exercise) to get it to work better.


It’s not really about this particular claim. It’s that I can read a comment that has a reasonable chain of logic and I don’t know if it’s true. This topic is just not easily studied and theories are hard to falsify.

Claims about flat earth are falsifiable with an at-home experiment.

Yeah, it's different. Anthropic profits when it delivers tokens. Hosting providers pay when Anthropic scrapes them.

Yeah, my big problem with the paper is that it might just be an artifact of Qwen's training process.


In all fairness, most of the unique stuff I can do is probably an artifact of my training process, so it seems unfair to deny an LLM the same accommodation.


How much did your training cost society?


This got me thinking, and it might actually be a comparable amount. Estimate 12 years of schooling at a minimum of $100,000 per student, at least in the US [1]. Then add whatever comes after: more money if the education is paid for (college) or "unpaid" (self-taught skills and improvements), plus the likely biggest, yet hardest-to-quantify, portion for white-collar workers: the experience and "value" that professional work equips one with.

Now divide the average SOTA LLM's training cost (or a guess, since these numbers aren't always published, as far as I'm aware) by the number of users, or, to be stricter, the number of people it's proven to be useful for (what else would training be for?), and it might not be so far off anymore.

Of course, whether it makes sense to divide and spread out the LLMs' costs across users in order to calculate an "average utility" is debatable.

[1] https://www.publicschoolreview.com/average-spending-student-...
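The division itself is trivial; the uncertainty is entirely in the inputs. A sketch with round placeholder numbers (both the training cost and the user count below are assumptions, not published figures):

```python
# Back-of-envelope: amortized training cost per user.
# Both inputs are hypothetical placeholders, not published figures.
training_cost_usd = 100_000_000   # assumed cost of one SOTA training run
active_users = 100_000_000        # assumed number of people it's useful for

cost_per_user = training_cost_usd / active_users
print(f"${cost_per_user:.2f} per user")  # $1.00 per user
```

The result swings by orders of magnitude depending on which estimates you plug in, which is rather the point of calling the comparison debatable.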


Was AlphaGo's move 37 original?

In the last step of training LLMs, reinforcement learning from verifiable rewards, LLMs are trained to maximize the probability of solving problems using their own output, driven by a reward signal akin to winning in Go. It's not just imitating human-written text.
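A toy sketch of that mechanism (nothing here reflects any lab's actual setup): a softmax policy over candidate answers, a verifier that rewards only the correct one, and a REINFORCE-style update that shifts probability toward answers that actually scored, with no reference text to imitate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "LLM": a softmax policy over 4 candidate answers to one problem.
logits = np.zeros(4)
correct = 2  # the verifier accepts only answer index 2

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

lr = 0.5
for _ in range(200):
    p = softmax(logits)
    a = rng.choice(4, p=p)                 # sample an answer from the policy
    reward = 1.0 if a == correct else 0.0  # verifiable reward, no human labels
    grad = -p                              # REINFORCE: grad of log p(a)
    grad[a] += 1.0
    logits += lr * reward * grad

print(softmax(logits)[correct])  # probability mass moves onto the verified answer
```

Updates happen only when the sampled answer passes the verifier, which is exactly why the resulting behavior isn't pure imitation.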

Fwiw, I agree that world models and some kind of learning from interacting with physical reality, rather than massive amounts of digitized gym environments, are likely necessary for a breakthrough toward AGI.


Recursive self-improvement. It's when AI speeds up the development of the next AI.


Location: SF (current). NYC/Philly general area acceptable. Remote okay. Email: rrenaud@gmail.com. Resume: 16-year SWE -> MLE @ Google, MS from NYU with a focus on ML. Retired. Now I hack on data analysis for video game projects for fun, and I love it. I'd take crazy-low compensation to work with interesting game datasets, e.g., for game balance, strategic analysis, or to improve/augment game video content.


What do y'all think about the latency/quality tradeoff with LLMs?

Human voices don't take 30 seconds to think, retrieve, research, and summarize a high-quality answer. Humans are calibrated in their knowledge: they know what they understand and what they don't. They can converse in real time without bullshitting.

Frontier real-time-ish LLM-generated voice systems are still plagued by 2024-era LLM nonsense, like the inability to count the Rs in strawberry. [1]

I'd personally love a voice interface that, constrained by the technology of today, takes the latency hit to deliver quality.

[1] https://www.instagram.com/reel/DTYBpa7AHSJ/?igsh=MzRlODBiNWF...


Not affiliated with Sesame, but this is what the realtime models are trying to solve. If you look at NVIDIA’s PersonaPlex release [0], it uses a duplex architecture. It’s based on Moshi [1], which aims to address this problem by allowing the model to listen and generate audio at the same time.

[0] https://github.com/NVIDIA/personaplex

[1] https://arxiv.org/abs/2410.00037


Please serve well quantized models.

If you can get 99 percent of the quality for 50 percent of the cost, that's usually a good tradeoff.
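A rough sketch of why the tradeoff tends to be favorable: plain per-tensor int8 round-to-nearest on random weights already gives 4x smaller storage for roughly a percent of relative error (real serving stacks use fancier schemes than this).

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(1024, 1024)).astype(np.float32)  # fake weight matrix

# Per-tensor symmetric int8 quantization: one scale for the whole tensor.
scale = np.abs(w).max() / 127.0
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale  # dequantize for comparison

rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
print(f"int8 storage: {q.nbytes / w.nbytes:.0%} of fp32, relative error {rel_err:.4f}")
```

How that per-tensor error translates into end-task quality is exactly the thing worth measuring before and after quantizing.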


Cite a source. Your concrete claim is that, on average, for every $1 of subscription revenue on a monthly subscription, OpenAI and Anthropic were losing $11.50?

It seems completely implausible.

I could believe that if a $20 sub used every possible token granted, it would cost $250. But almost certainly no one was completely milking their subscription, in the same way that no one is streaming Netflix literally 24/7.
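The structure of that estimate is easy to sanity-check. All numbers below are hypothetical: an assumed serving cost at 100% of the token grant, and an assumed mix of how much of the grant subscribers actually use.

```python
# Hypothetical numbers, just to show the structure of the estimate.
price = 20.0                 # monthly subscription price
cost_at_full_usage = 250.0   # assumed serving cost if 100% of the grant is used

# Assumed utilization mix: (fraction of grant used, share of subscribers).
mix = [(0.05, 0.70), (0.30, 0.25), (1.00, 0.05)]

expected_cost = sum(cost_at_full_usage * frac * share for frac, share in mix)
print(f"expected cost/user: ${expected_cost:.2f} vs ${price:.2f} revenue")
```

Even with this pessimistic mix, the expected loss per $1 of revenue comes out around $1, an order of magnitude short of the claimed $11.50, which is why the claim needs a source.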


I’ve been experimenting with a live win probability predictor for the 10-player arcade game Killer Queen. The goal is to predict the winner in a causal, event-by-event fashion.

Right now I’m struggling to beat a baseline LightGBM model trained on hand-engineered expert features. My attempts at putting a win-probability head on top of nanoGPT, treating events as tokens, have been significantly worse: about 65% accuracy versus LightGBM’s 70%. That 5-point gap is huge given how stochastic the early game is, and the Transformer is easily four orders of magnitude more expensive to train.

To bridge the gap, I’m moving to a hybrid approach. I’m feeding those expert features back in as additional tokens or auxiliary loss heads, and I am using the LightGBM model as a teacher for knowledge distillation to provide smoother gradients.
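The distillation term I have in mind looks roughly like this (a minimal sketch of the standard soft-target formulation; the function name, alpha, and numbers are all illustrative, not my actual training code): the student minimizes a blend of cross-entropy on the true winner and KL divergence to the LightGBM teacher's predicted win probability.

```python
import numpy as np

def distill_loss(student_p, teacher_p, label, alpha=0.5, eps=1e-7):
    """Blend hard-label cross-entropy with a soft teacher target.

    student_p, teacher_p: predicted P(win) for one team; label: 1 if they won.
    alpha weights the teacher term; all names here are illustrative.
    """
    s = np.clip(student_p, eps, 1 - eps)
    t = np.clip(teacher_p, eps, 1 - eps)
    hard = -(label * np.log(s) + (1 - label) * np.log(1 - s))
    # Binary KL(teacher || student): smoother gradients than 0/1 labels alone.
    soft = t * np.log(t / s) + (1 - t) * np.log((1 - t) / (1 - s))
    return (1 - alpha) * hard + alpha * soft

print(distill_loss(0.6, 0.7, 1))
```

The soft term is what provides signal in the stochastic early game, where the 0/1 outcome label is nearly uninformative event by event.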

The main priority here is personalized post-game feedback. By tracking sharp swings in win probability (ΔWP), you can automatically generate highlight or lowlight reels right after a match. It helps players see the exact moment a play was either effective or catastrophic.

There is also a clear application for automated content creation. You can use ΔWP as a heuristic to identify the actual turning points of a match for YouTube summaries, without manually scrubbing through hours of Twitch footage.
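The heuristic itself is tiny. A sketch (the threshold and the toy series are made up): scan the per-event win-probability series and keep the events whose swing exceeds a cutoff.

```python
def highlight_events(win_probs, threshold=0.15):
    """Return (event_index, delta) pairs where |ΔWP| jumps past threshold."""
    swings = []
    for i in range(1, len(win_probs)):
        delta = win_probs[i] - win_probs[i - 1]
        if abs(delta) >= threshold:
            swings.append((i, delta))
    return swings

# Toy series: a big swing up at event 3 and a collapse at event 5.
wp = [0.50, 0.52, 0.55, 0.80, 0.78, 0.45]
print(highlight_events(wp))  # flags events 3 and 5
```

Positive deltas mark highlight candidates and negative ones lowlights, so the same pass feeds both kinds of reel.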


Big fan of this game. The arcade version is a blast if you can find it in your particular city.

Are you playing competitively (league play, tournaments)? Or just passionate about the game?


I used to play very competitively, but I've been more chill recently. I just think it's a nice problem/dataset to work with, because of the depth of my understanding of the game.

