
I don't think this is technically possible without something like homomorphic encryption, which imposes too large a runtime cost for use in LLMs.

They don't even try to prove it any other way.

Am I alone in thinking this stuff is nuts? (Currently half way through the article, btw.)

Analyzing "emotion" in the model is completely anthropocentric. If we indulge in the idea that LLMs of sufficient complexity can be conscious, then why is it any more likely that "emotion concepts" cause suffering any more than, say, reading ugly code? Maybe getting stuck in token loops is the most excruciating thing imaginable. The only logically coherent thing to do, if you're concerned about model welfare, is stop your training and inference.

Relatedly, I hope everyone involved in model welfare is an outspoken vegetarian, as that addresses a much more immediate problem.


> Analyzing "emotion" in the model is completely anthropocentric.

Yeah, asking a text generator designed to sound as-human-as-possible about its "welfare" then actually giving credence to the output is a category error.

It's like asking a ceramic mug with "Best Dad!" written on the side if I'm the best dad, then uncritically just believing the words painted there. :( :( :(


Read the first few paragraphs, it's completely unhinged. Once I actually grasped what "model welfare" was, I noped out.

My favorite (admittedly not super useful) trick in this domain is that sbb eax, eax breaks the dependency on the previous value of eax (just like xor and sub) and only depends on the carry flag. arm64 is less obtuse and just gives you csetm (special case of csinv) for this purpose.

That's even more useful because of x86's braindamaged setcc, which only affects the lowest byte of the destination, AFAIR, and so in practice always has to be combined with a zeroing idiom before the setcc or a zero extension after it.
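For anyone who hasn't seen the idiom: a carry-derived all-ones/all-zeros mask has a portable C spelling that compilers typically lower to sbb eax, eax (or csetm on arm64). A minimal sketch — carry_mask and select_u32 are my own names, not a standard API:

```c
#include <stdint.h>

/* 0u - c is all-zeros for c == 0 and all-ones for c == 1, with no
   dependency on any previous value of the destination -- the C
   analogue of `sbb eax, eax` after a compare that sets carry. */
static inline uint32_t carry_mask(uint32_t c) {
    return 0u - c;  /* c = 0 -> 0x00000000, c = 1 -> 0xFFFFFFFF */
}

/* Example use: branchless select between x and y. */
static inline uint32_t select_u32(uint32_t c, uint32_t x, uint32_t y) {
    uint32_t m = carry_mask(c);
    return (x & m) | (y & ~m);  /* x if c == 1, else y */
}
```

The point of the trick is exactly what the comment above says: unlike setcc, the mask comes out full-width in one instruction with no extra zeroing.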

That part of his tweet made me laugh out loud. I don't understand who it's directed toward.

The market. Rauch is 'strategic' like that; he'd even use a moment like this to sneak in a sound bite to froth the market he has so much skin in.

"Vercel CEO says AI accelerated attack on critical infrastructure"


*sigh* Right.

Ironically, if the timeline is true that the attackers had been inside for months, the AIs they had access to are substantially weaker than today's frontier models. How much faster would they have achieved their goals with GLM 5.1?


It is sooooo laggy for me.

Runs nicely on my M4 Max Mac Studio - which, going by the PassMark numbers, is about the same speed as an iPhone 17. Testament, I think, to how well this site is optimised for the sort of underpowered device, hopelessly inadequate for modern workflows, that many sites would not bother to cater for.

Excellent!! I love interval arithmetic and also wrote a TS implementation for a graphing calculator project. Agree that it's very underrated, and I wish that directed rounding was exposed in more languages.

Yeah it's super interesting. Like you said, I learned that the IEEE 754 spec actually requires that complete implementations of floating point expose a way to programmatically choose the rounding mode. As far as I know only C lets you do that, and even then it depends on hardware support. For JS I had to use ugly typed-array casts, which kinda only accidentally work because of endianness. But technically there should be an API for it!

There's other unused stuff in IEEE 754 like that: the inexact bit or signaling NaNs!


Julia has full IEEE 754 rounding mode support.

It's not quite the same as XOR swap, but a trick I've found handy is conditionally swapping values using XOR:

    int a, b;
    bool cond;
    
    int swap = cond ? a ^ b : 0;
    a ^= swap;
    b ^= swap;
If cond is highly unpredictable this can work rather nicely.
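If you want to remove the ternary entirely (so there's no branch even before the compiler gets clever with cmov), the same mask idiom works; a sketch, with cond_swap being my own name:

```c
/* Branchless conditional swap: build an all-ones/all-zeros mask from
   `cond` instead of using a ternary. */
static void cond_swap(int *a, int *b, int cond) {
    int swap = (*a ^ *b) & -(cond != 0);  /* a^b if cond, else 0 */
    *a ^= swap;
    *b ^= swap;
}
```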

Electricity is heavily regulated. Is there any evidence that LLMs will be the same?

Was electricity regulated in the first decade of its existence?

I don't know but likely not. Factories were powered by steam then, and had a "power plant" on site. So they didn't convert to electricity until it was reliable and guaranteed.

Was anything regulated in those times? You could legally buy humans at that time.

But that doesn't mean we live with the same standards. The lack of regulation of electricity led to a lot of deaths and disasters, which is why it was regulated.

But we don't live at the start of the 20th century; we live in 2026, and we must learn from the past instead of being hell-bent on repeating it.


Cool investigation. This part perplexes me, though:

> Games have apparently been using split locks for quite a while, and have not created issues even on AMD’s Zen 2 and Zen 5.

For the life of me I don't understand why you'd ever want to do an atomic operation that's not naturally aligned, let alone one split across cache lines....


> For the life of me I don't understand why you'd ever want to do an atomic operation that's not naturally aligned, let alone one split across cache lines....

I assume they force-packed their structures and they're poorly aligned, but x86 doesn't fault on unaligned access and Windows doesn't detect and punish split locks, so while you probably would get better performance with proper alignment, it might not be a meaningful improvement on the majority of the machines running the program.
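To make the hypothesis concrete: a force-packed layout like the following (hypothetical, not from any actual game) puts a 4-byte counter at offset 62, so it straddles a 64-byte cache line, and any lock-prefixed RMW on it becomes a split lock:

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)        /* force 1-byte packing, losing natural alignment */
struct net_state {
    char header[62];         /* pushes `refcount` to offset 62 */
    uint32_t refcount;       /* bytes 62..65: crosses the 64-byte line */
};
#pragma pack(pop)
```

On x86 an atomic increment of refcount still "works" here (the CPU falls back to a bus lock); it's just slow. On ARM the same misaligned atomic would fault.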


Ah, that's a great hypothesis. I wonder, then, how it works with x86 emulation on ARM. IIRC, atomic ops on ARM fault if the address isn't naturally aligned... but I guess the runtime could intercept that and handle it slowly.

ARM macs apparently have some kind of specific handling in place for this when a process is running with x86_64 compatibility, but it’s not publicly documented anywhere that I can see.

XNU has this oddity: https://github.com/apple-oss-distributions/xnu/blob/f6217f89...

Redacted from open source XNU, but exists in the closed source version


Is it actually redacted, or just a leftover stub from a feature implemented in silicon instead of software? Isn't the x86 memory order compatibility done at hardware level?

Redacted

An emulated x86 atomic instruction wouldn’t need to use atomic instructions on ARM.

Why not?

They don’t have to match.

As an example, consider a divide instruction. A machine without an FPU can emulate a machine that has one. It will legitimately have to run hundreds or thousands of instructions to emulate a single divide instruction, so it will certainly take longer.

That's OK; it just means the emulation is slower there than for something like add, where the host has a native instruction. In 'emulator time' you still only ran one instruction. That world is still consistent.


? That's not how Windows on ARM emulation works. It uses dynamic JIT translation from x86 to ARM. When the compiler sees, e.g., lock add [mem], reg presumably it'll emit a ldadd, but that will have different semantics if the operand is misaligned.

You mean the locking would be done in software?

They don't do it on purpose.

It's just really easy to do accidentally with custom allocators, and games tend to use custom allocators for performance reasons.

The system malloc will return pointers aligned to the size of the largest atomic operation by default (16 bytes on x86), and compilers depend on this automatic alignment for correctness. But it's real easy for a custom allocator to use a smaller alignment. Maybe the author didn't know, maybe they assumed they would never need the full 16-byte atomics. Maybe the 16-byte atomics weren't added until well after the custom allocator was written.
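A minimal sketch of how this happens, assuming a toy bump allocator (the names are mine): after a 5-byte allocation, the next block can't be 16-byte aligned no matter how aligned the arena itself is.

```c
#include <stddef.h>

/* Toy bump allocator that never rounds the offset up, so any odd-sized
   allocation misaligns everything handed out after it. */
static unsigned char arena[1024];
static size_t bump_off;

static void *bump_alloc(size_t n) {
    void *p = &arena[bump_off];  /* no alignment fix-up here */
    bump_off += n;
    return p;
}
```

The fix is one line: round bump_off up before taking the pointer, e.g. `bump_off = (bump_off + 15) & ~(size_t)15;`.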


Packing structures can improve performance and overall memory usage by reducing cache misses.

Unlikely – no one is starting off undecided, then reading one article in The New Yorker and then committing this. And it's a slippery slope to tie it to legitimate criticism.
