
I don't think this is technically possible without something like homomorphic encryption, which imposes too large a runtime cost for use in LLMs.

They don't even try to prove it any other way.

Am I alone in thinking this stuff is nuts? (Currently half way through the article, btw.)

Analyzing "emotion" in the model is completely anthropocentric. If we indulge in the idea that LLMs of sufficient complexity can be conscious, then why is it any more likely that "emotion concepts" cause suffering any more than, say, reading ugly code? Maybe getting stuck in token loops is the most excruciating thing imaginable. The only logically coherent thing to do, if you're concerned about model welfare, is stop your training and inference.

Relatedly, I hope everyone involved in model welfare is an outspoken vegetarian, as that addresses a much more immediate problem.


> Analyzing "emotion" in the model is completely anthropocentric.

Yeah, asking a text generator designed to sound as-human-as-possible about its "welfare" then actually giving credence to the output is a category error.

It's like asking a ceramic mug with "Best Dad!" written on the side if I'm the best dad, then uncritically just believing the words painted there. :( :( :(


Read the first few paragraphs, it's completely unhinged. Once I actually grasped what "model welfare" was, I noped out.

My favorite (admittedly not super useful) trick in this domain is that sbb eax, eax breaks the dependency on the previous value of eax (just like xor and sub) and only depends on the carry flag. arm64 is less obtuse and just gives you csetm (special case of csinv) for this purpose.

That's even more useful because of x86's braindamaged setcc, which only affects the lowest byte of the destination, AFAIR, and so in practice always has to be combined with a zeroing idiom before the setcc or a zero extension after it.
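For anyone who hasn't seen the idiom: a carry-derived all-ones/all-zeros mask has a portable C spelling that compilers typically lower to sbb eax, eax (or csetm on arm64). A minimal sketch — carry_mask and select_u32 are my own names, not a standard API:

```c
#include <stdint.h>

/* 0u - c is all-zeros for c == 0 and all-ones for c == 1, with no
   dependency on any previous value of the destination -- the C
   analogue of `sbb eax, eax` after a compare that sets carry. */
static inline uint32_t carry_mask(uint32_t c) {
    return 0u - c;  /* c = 0 -> 0x00000000, c = 1 -> 0xFFFFFFFF */
}

/* Example use: branchless select between x and y. */
static inline uint32_t select_u32(uint32_t c, uint32_t x, uint32_t y) {
    uint32_t m = carry_mask(c);
    return (x & m) | (y & ~m);  /* x if c == 1, else y */
}
```

The point of the trick is exactly what the comment above says: unlike setcc, the mask comes out full-width in one instruction with no extra zeroing.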

That part of his tweet made me laugh out loud. I don't understand who it's directed toward.

The market. Rauch is 'strategic' like that; he'd even use a moment like this to sneak in a sound bite to froth the market he has so much skin in.

"Vercel CEO says AI accelerated attack on critical infrastructure"


*sigh* Right.

Ironically, if the timeline is true that the attackers had been inside for months, the AIs they had access to are substantially weaker than today's frontier models. How much faster would they have achieved their goals with GLM 5.1?


It is sooooo laggy for me.

Runs nicely on my M4 Max Mac Studio - which, going by the PassMark numbers, is about the same speed as an iPhone 17. Testament, I think, to how well this site is optimised for the sort of underpowered device, hopelessly inadequate for modern workflows, that many sites would not bother to cater for.

Excellent!! I love interval arithmetic and also wrote a TS implementation for a graphing calculator project. Agree that it's very underrated, and I wish that directed rounding was exposed in more languages.

Yeah it's super interesting. Like you said, I learned that the IEEE 754 spec actually requires that complete implementations of floating point expose a way to programmatically choose the rounding mode. As far as I know only C lets you do that, and even then it depends on hardware support. For JS I had to use ugly typed-array casts, which kinda only accidentally work because of endianness. But technically there should be an API for it!

There's other unused stuff in IEEE 754 like that: the inexact bit or signaling NaNs!


Julia has full IEEE 754 rounding mode support.

It's not quite the same as XOR swap, but a trick I've found handy is conditionally swapping values using XOR:

    int a, b;
    bool cond;
    
    int swap = cond ? a ^ b : 0;
    a ^= swap;
    b ^= swap;
If cond is highly unpredictable this can work rather nicely.
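If you want to remove the ternary entirely (so there's no branch even before the compiler gets clever with cmov), the same mask idiom works; a sketch, with cond_swap being my own name:

```c
/* Branchless conditional swap: build an all-ones/all-zeros mask from
   `cond` instead of using a ternary. */
static void cond_swap(int *a, int *b, int cond) {
    int swap = (*a ^ *b) & -(cond != 0);  /* a^b if cond, else 0 */
    *a ^= swap;
    *b ^= swap;
}
```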

Electricity is heavily regulated. Is there any evidence that LLMs will be the same?

Was electricity regulated in the first decade of its existence?

I don't know but likely not. Factories were powered by steam then, and had a "power plant" on site. So they didn't convert to electricity until it was reliable and guaranteed.

Was anything regulated in those times? You could legally buy humans at that time.

But that doesn't mean we live with the same standards. The lack of regulation of electricity led to a lot of deaths and disasters, which is why it was regulated.

But we don't live at the start of the 20th century; we live in 2026, and we must learn from the past instead of being hell-bent on repeating it.


Cool investigation. This part perplexes me, though:

> Games have apparently been using split locks for quite a while, and have not created issues even on AMD’s Zen 2 and Zen 5.

For the life of me I don't understand why you'd ever want to do an atomic operation that's not naturally aligned, let alone one split across cache lines....


> For the life of me I don't understand why you'd ever want to do an atomic operation that's not naturally aligned, let alone one split across cache lines....

I assume they force-packed their structures and they're poorly aligned, but x86 doesn't fault on unaligned access and Windows doesn't detect and punish split locks, so while you probably would get better performance with proper alignment, it might not be a meaningful improvement on the majority of the machines running the program.
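To make the hypothesis concrete: a force-packed layout like the following (hypothetical, not from any actual game) puts a 4-byte counter at offset 62, so it straddles a 64-byte cache line, and any lock-prefixed RMW on it becomes a split lock:

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)        /* force 1-byte packing, losing natural alignment */
struct net_state {
    char header[62];         /* pushes `refcount` to offset 62 */
    uint32_t refcount;       /* bytes 62..65: crosses the 64-byte line */
};
#pragma pack(pop)
```

On x86 an atomic increment of refcount still "works" here (the CPU falls back to a bus lock); it's just slow. On ARM the same misaligned atomic would fault.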


Ah, that's a great hypothesis. I wonder, then, how it works with x86 emulation on ARM. IIRC, atomic ops on ARM fault if the address isn't naturally aligned... but I guess the runtime could intercept that and handle it slowly.

ARM macs apparently have some kind of specific handling in place for this when a process is running with x86_64 compatibility, but it’s not publicly documented anywhere that I can see.

XNU has this oddity: https://github.com/apple-oss-distributions/xnu/blob/f6217f89...

Redacted from open source XNU, but exists in the closed source version


Is it actually redacted, or just a leftover stub from a feature implemented in silicon instead of software? Isn't the x86 memory order compatibility done at hardware level?

Redacted

An emulated x86 atomic instruction wouldn’t need to use atomic instructions on ARM.

Why not?

They don’t have to match.

As an example, consider a divide instruction. A machine without an FPU can emulate a machine that has one. It will legitimately have to run hundreds or thousands of instructions to emulate a single divide instruction, so it will certainly take longer.

That's OK; it just means the emulation is slower there than for something like add, where the host has a native instruction. In 'emulator time' you still only ran one instruction. That world is still consistent.


? That's not how Windows on ARM emulation works. It uses dynamic JIT translation from x86 to ARM. When the compiler sees, e.g., lock add [mem], reg presumably it'll emit a ldadd, but that will have different semantics if the operand is misaligned.

You mean the locking would be done in software?

They don't do it on purpose.

It's just really easy to do accidentally with custom allocators, and games tend to use custom allocators for performance reasons.

The system malloc will return pointers aligned to the size of the largest atomic operation by default (16 bytes on x86), and compilers depend on this automatic alignment for correctness. But it's real easy for a custom allocator to use a smaller alignment. Maybe the author didn't know, maybe they assumed they would never need the full 16-byte atomics. Maybe the 16-byte atomics weren't added until well after the custom allocator was written.
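A minimal sketch of how this happens, assuming a toy bump allocator (the names are mine): after a 5-byte allocation, the next block can't be 16-byte aligned no matter how aligned the arena itself is.

```c
#include <stddef.h>

/* Toy bump allocator that never rounds the offset up, so any odd-sized
   allocation misaligns everything handed out after it. */
static unsigned char arena[1024];
static size_t bump_off;

static void *bump_alloc(size_t n) {
    void *p = &arena[bump_off];  /* no alignment fix-up here */
    bump_off += n;
    return p;
}
```

The fix is one line: round bump_off up before taking the pointer, e.g. `bump_off = (bump_off + 15) & ~(size_t)15;`.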


Packing structures can improve performance and overall memory usage by reducing cache misses.

Unlikely – no one is starting off undecided, then reading one article in The New Yorker and then committing this. And it's a slippery slope to tie it to legitimate criticism.
