Hacker News | Eridrus's comments

As a person doing design, yes, you feel like you cannot let go.

But as a person employing designers, I have already accepted that I will let go.

We did a marketing website redesign for our B2B SaaS product with a third-party design firm. We gave a lot of input, but the thing isn't perfect; at some point we had to call it done. It was still a significant improvement over what we previously had, but I am under no illusions that it is a masterpiece.

Now, coding tools do have some clear shortcomings for design at the moment, but how long they will stay like that is not clear.


The only real example I can think of being bootstrapped is Airbnb, and even that wasn't bootstrapped for long.

Unless you have a good go-to-market strategy, you might want to try something easier.

At the risk of being overly critical, the cost of shipping packages is pretty low unless you're trying to do same day delivery, in which case Uber already lets you get your package delivered.


The article itself spells out several alternatives to buying a continual supply of helium: high-temperature superconductors, and zero-boil-off systems that don't require constant top-ups.

All these "we're going to run out" stories pretend that engineering cannot adapt to changing cost structures, which is just total nonsense.

Sure, there is nothing that can be directly substituted for how we use helium today, but clearly we're using it inefficiently, and once markets force us to change, we will find more efficient ways.


The article also points out several cases where this isn't possible.

The article is just helpfully illustrating how artisanal you can make your slop if you really try!


How are people building anything without evals?

Maybe I spent too much time in the ML mines, but it is somewhat inconceivable to me to iterate on a tricky problem without an eval set.
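For what it's worth, the minimal version of an eval set is just labeled cases plus a pass-rate loop; a toy sketch (the `run_model` stub and the cases below are made up for illustration, not any real system):

```python
# Minimal eval-set sketch: labeled cases and a pass-rate score.
# `run_model` is a placeholder for whatever system you're iterating on;
# in real use it would call your LLM or pipeline.

def run_model(prompt: str) -> str:
    # Stub "model" so the sketch is runnable end to end.
    return "4" if prompt == "2 + 2 = ?" else "unknown"

EVAL_SET = [
    {"prompt": "2 + 2 = ?", "expected": "4"},
    {"prompt": "Capital of France?", "expected": "Paris"},
]

def evaluate(cases):
    # Fraction of cases where the model output exactly matches the label.
    passed = sum(run_model(c["prompt"]) == c["expected"] for c in cases)
    return passed / len(cases)

if __name__ == "__main__":
    print(f"pass rate: {evaluate(EVAL_SET):.0%}")  # 50% with the stub above
```

Exact-match scoring is the simplest grader; real evals often swap in fuzzier scorers, but the iterate-and-measure loop is the same.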


Beyond this, you are what you do.

And you are what you do for other people.

Besides providing support and entertainment for our friends and families, the concrete value we bring to society comes through our jobs.

Society doesn't run on hanging out or hobbies.


Cursor have said they are using Composer through their inference provider (Fireworks). Presumably the MIT license is not viral like the GPL, so Cursor, and the companies that use Cursor, do not need to display Kimi attribution on their products.

It's definitely not what Kimi wanted, but it sounds like that's what the license says.


Unrelated to FSD, what's a good example where frontier AI struggles with logical thinking that even stupid humans can figure out?

I personally feel like that isn't really true any more.


The recent one was whether I should drive my car to the car wash if it's only 300 feet from my house, though even that wasn't a slam dunk.


Right, but if these things are so rare that we all only know the one viral example, I feel like that lends credence to the models basically generally not having this problem.

Researchers built the Winograd Schema Challenge more than a decade ago to assess common-sense reasoning, and LLMs beat that challenge around GPT-4.


They're not so rare. Hallucinations have been spotted everywhere, but the "driving a car to the car wash" is an amusing one that's been recently publicised. Developers aren't going to point out every time an LLM hallucinates an entire library.


I'd add to this, any moderately involved logical or numerical problem causes hallucinations for me on all frontier models.

If you ask such problems in isolation, the model may write a script to solve them "properly", but I guess that's because enough of these were added to the training set. This workaround doesn't scale.

As soon as I give the LLM a proper problem and a small part of it requires numeric reasoning, it almost always hallucinates something and doesn't solve it with a script.

If the logic/math is part of a larger problem the miss rate is near 100%.

LLMs have massive amounts of knowledge, encoded in verbal intelligence, but their logic intelligence is well below even average human intelligence.

If you look at how they work (tokenization and embeddings), it's clear that transformers alone will not solve the issue. The escape hatches work only very unreliably.
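A toy illustration of the tokenization point: BPE-style tokenizers tend to split numbers into arbitrary multi-digit chunks, so digits of different numbers rarely line up by place value. The mini-vocabulary and greedy matcher below are invented for the sketch; real tokenizers are far larger but produce similarly arbitrary splits:

```python
# Toy greedy tokenizer with a made-up vocabulary of multi-digit pieces.
# Illustrates why column-wise digit arithmetic is awkward for an LLM:
# the "columns" don't exist at the token level.

def toy_tokenize(number: str, vocab=("123", "45", "6789", "0")) -> list[str]:
    tokens, i = [], 0
    while i < len(number):
        for piece in vocab:  # greedily try known multi-digit pieces first
            if number.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:  # fall back to a single character
            tokens.append(number[i])
            i += 1
    return tokens

print(toy_tokenize("1234567890"))  # ['123', '45', '6789', '0']
print(toy_tokenize("234567891"))  # different chunking, misaligned digits
```

Two numbers that a human would add digit by digit get chunked differently, which is one intuition for why arithmetic embedded in a larger prompt fails so often.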


What's a typical example?

I have been broadly quite happy with gpt 5.4 xhigh's reasoning on things like performance engineering tasks.


If you ask this of any current-day AI, it will answer exactly how you would expect: telling you to drive, and acknowledging the comedic nature of the question.


That's because AI labs keep stamping out the widely known failures. I assume without actually retraining the main model, but with some small classifier that detects the known meme questions and injects the correct answer into the context.

But try asking your favorite LLM what happens if you're holding a pen with two hands (one at each end) and let go of one end.



Are you also an LLM? Do objects often begin rotating when you're only holding them with one hand?


It's not unlikely that you're talking to a lot of AI-based AI boosters. It's easier to create astroturfed comments with chatbots than to fix the inherent problems.


I always like to ask AI to generate a middle-aged blond man with gray hair. Turns out that every model renders the gray hair with black roots.

https://chatgpt.com/share/69bcd01a-a750-800d-95f5-3b840b9ee2...

https://gemini.google.com/share/edc223bb6291 (the second attempt gave a woman, oops)

Even Midjourney couldn't do it.


Nice. My test was always a blond bald guy. It always adds hair. If you ask for bald, you get a dark-haired bald guy; if you add blond, you can't get bald, because I guess stating the hair color implies hair on the head, while you may just want blond eyebrows and/or blond stubble.


It's not horrifically slow.


I think plenty of software is a pile of shit, and I still derive value from it.


Exactly, better the pile of shit you know than the pile of shit you don't know, or the pile of shit that is unknowable.


Yeah I'd go so far as to say that most useful software is "bad" in some way.


Worse is Better

