It's (at least partially) the layoffs. I've noticed significant degradation in the external-facing administrative layer at these companies. I recently did some work for a company that was trying to partner with Meta's e-commerce platform, and even though there was a ton of documentation on how to integrate, etc., the human approval and planning piece of the project was completely dysfunctional on their side.
MS showing a "view summary" button for all meetings, then doing a bait-and-switch to tell you to buy a Copilot license (on a corporate seat no less, where regular users don't have purchasing power), is my top annoyance now.
> Interesting you say the Dev isn't a great person, because I had a hunch when I saw the use of the Lena photo on the front page
You say:
> you guys are ruthless (...) You people are gross.
I'm not saying you don't have a point. There was a time I didn't know enough to be sensitive about the Lena topic either, and I could have been the target of the comment above. So I think, perhaps, those points could have been formulated more constructively.
However, I must say the same for your comment too. Can't we all be friends here? :)
I agree that calling someone a bad person for using one of the most common test images is excessive. However, regarding this:
> The subject of the photograph merely went along with it.
The subject of the photograph did ask for it to no longer be used. Here's a quote from her:
> I retired from modeling a long time ago. It’s time I retired from tech, too.
> to defiantly do the opposite.
If the policing comes from third parties engaged in virtue signalling, that's fair game. Here, though, I'd suggest that respecting her wish is simply common courtesy, and I'd consider someone who defiantly doesn't to be a somewhat rude person.
Yes, I'm aware of her statement. My view is that she merely went along with what I see as reactionary nonsense as opposed to actually caring about the use of her likeness. We all have a civic duty to actively push back against the spread of polarizing reactionary movements.
Even if I believed her request to be genuine I can't bring myself to view reproducing a commercial image of a professional model that's in widespread circulation as being unethical under any circumstances. Neither would I ever agree to stop distributing a well known book if one day many years later the author woke up suddenly wanting to undo its publication. If you find my viewpoint confusing or seemingly unreasonable, for reference I view projects such as Anna's Archive in a positive light.
While I strongly disagree with what I perceive to be the intent behind the image being banned by many journals, I nonetheless agree with the outcome. It's an objectively poor test image for demonstrating the technical capabilities of the vast majority of modern applications. We don't benchmark modern video codecs by encoding VHS rips of classic Disney movies and we shouldn't do the equivalent for still images.
I think this shouldn't be attributed to malice, however unfortunate it is. I once developed a sync app myself, and OneDrive folders were indeed problematic, causing cyclic updates on access and random metadata changes for no apparent reason.
A complete lack of communication (outside of release notes, which nobody really reads, as the article also states) is incompetence, and indeed worrying.
Just show a red status bar that says "these folders will no longer be backed up". Why not?
If the constant metadata changes (or other peculiarities involving those folders) make syncing unusable, then it can be both. In that case, you stop syncing and you communicate it.
So my theory is that it's a competence problem (lack of communication), not malice. But it's just a theory, based on my own experience.
In any case, this is a bad situation, however you look at it.
You can't have the version numbers go down like this. Try pitching S2 to your project manager: you'll be told, 100%, "Why not use S4 instead? That sounds better."
It also breaks a lot of a11y tooling. It really helps a lot of people when developers care about semantic HTML.
I'd personally suggest web devs install axe DevTools [0] in their dev browser profile. Also, LLMs have gotten to the point where even small local models can help a lot [1].
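As a toy illustration of the kind of thing such tooling flags (a minimal sketch only, in no way a substitute for axe; the checker class and its two rules here are my own, hypothetical ones):

```python
from html.parser import HTMLParser

class NaiveA11yChecker(HTMLParser):
    """Toy linter that flags two anti-patterns semantic HTML avoids."""
    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # A clickable <div>/<span> is invisible to keyboard and
        # screen-reader users; a real <button> gets focus and
        # semantics for free.
        if tag in ("div", "span") and "onclick" in attrs:
            self.issues.append(f"<{tag} onclick=...>: use <button> instead")
        # Images without alt text are announced as unlabeled noise.
        if tag == "img" and "alt" not in attrs:
            self.issues.append("<img> without alt attribute")

checker = NaiveA11yChecker()
checker.feed('<div onclick="go()">Go</div><img src="x.png">')
print(checker.issues)
```

Real tools like axe apply hundreds of rules with far more nuance; the point is only that many of these checks are mechanical once the markup is semantic.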
Writing this from a corporate Win11 computer; the whole thing is so laggy it's unbelievable. Last year I revived my old desktop from 2007 with an Intel Q6600, Windows XP, and a clicky, dying HDD, and that thing flew compared to this. Dear Microsoft and its partners (especially Dell!), what the hell happened?!
Your actions, whether intentional and direct or not, allowed one more sale of Win11 and an accompanying sad Dell computer, signaling to them (however weakly, from you as a single individual) that whatever crap they have been doing up to now is still a good way to sell one of those combinations.
Not to shoot down your comment with sarcasm, I'm being really honest: I switched to an expensive shower gel this week, and it had an unexpected, exciting effect. Small things can really have consequences much bigger than themselves.
That said, if you ever decide to solve the toy-tidying problem, start a Kickstarter; I pledge to pledge support! :D
Some people are not sensitive to quality. A car is a car, a shower gel is a shower gel, etc. In the computer world, they curiously congregate around Microsoft...
5.4, in my own testing, was almost always ahead of Opus 4.6 for reviews and planning. I'm on the Plus plan on OpenAI, so I couldn't test it as deeply. Could anyone with more experience on both chime in? Pros/cons compared to Opus? I'm invested in the Claude ecosystem, but the recent drop in quality and session limits has me on the edge.
Same for me. I'm on the $20 plan for both and use them interchangeably. Similar "intelligence", IMO; just a different way of doing things. But Claude has been getting worse in terms of token usage, so I cancelled my plan last month.
Yeah, it's probably a bit better overall. 5.4 is a month newer than Opus 4.6.
My guess is that 5.5 will come out soon and be significantly better, so you'd want to be using Codex then; but when Opus 5 comes out, it'll probably be back to Claude Code.
Also, 5.4 has a fast mode and higher usage limits, since it's cheaper.
I use opencode, so I can toggle between Claude and Codex fairly easily, and I do so whenever one of them is having problems (until yesterday, that is, when Claude blocked opencode for good and I cancelled my account). This means I'm using the same prompts and instructions for both.
Personally, it seems like I have to redirect Opus/Sonnet much less often. GPT felt pretty "dense": it was more likely to ignore earlier instructions in the session, I had to remind it more often, and when I reviewed the code it produced, I had to make more corrections that seemed obvious.
Entirely subjective, but I also find I prefer Claude's "personality" to ChatGPT's, though I couldn't point to any specific differences.
Just curious as I've often heard that Claude was superior for planning/architecture work while ChatGPT was superior for actual implementation and finding bugs.
Claude makes more detailed plans that seem better if you just skim them, but when analyzed, they usually contain a lot of errors.
It compensates for most of them during implementation if you make it use TDD, either via superpowers et al. or just by telling it to do so.
GPT 5.4 makes simpler plans (compared to superpowers, a plugin from the official Claude plugin marketplace, not the built-in plan mode), but is better at filling in the details while implementing.
Plan mode in Claude Code has gotten much better in recent months, but missing details cannot be compensated for by the model during implementation.
So my workflow has been:
Have Claude plan with superpowers:brainstorm; review the spec and make updates; give the spec to GPT, usually to witness grave errors found by GPT; the spec gets updated; another manual review; (many iterations later) the final spec is written; write the plan; GPT finds mind-boggling errors; (many iterations later) a Claude agent swarm implements; GPT finds even more errors; I find errors; fix, fix, fix; manual code review and red tests from me; tests get fixed; (many iterations later) finally, something usable with, at most, stylistic issues (human opinion)!
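The produce/review ping-pong in that workflow can be sketched as a small driver loop (purely illustrative; the function names are hypothetical stand-ins for manual steps or CLI calls, not real APIs):

```python
def cross_review(artifact, produce, review, max_iters=10):
    """Iterate: reviewer finds errors, producer fixes them, until clean.

    `produce` and `review` are placeholders for whatever actually does
    the work (a model call, a CLI invocation, or a human).
    """
    for _ in range(max_iters):
        errors = review(artifact)
        if not errors:
            return artifact
        artifact = produce(artifact, errors)
    raise RuntimeError("reviewer still finds errors after max_iters rounds")

# Toy demo: the "reviewer" wants at least 3; the "producer" increments.
result = cross_review(
    0,
    lambda a, errs: a + 1,
    lambda a: [] if a >= 3 else ["too small"],
)
print(result)  # 3
```

The same loop would be run once for the spec, once for the plan, and once for the implementation, with a different producer/reviewer pair each time.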
This happens with the most complex features, which would be a nightmare to implement even for the most experienced programmers, of course. For basic things, most SOTA models can one-shot it anyway.
Interesting. Have you ever had Claude re-review its plan after having it draft the original plan? Or do you give it to GPT right away to review?
Just curious as I'm trying to branch out from using Claude for everything, and I've been following a somewhat similar workflow to yours, except just having Claude review and re-review its plan (sometimes using different roles, e.g. system architect vs SWE vs QA eng) and it will similarly identify issues that it missed originally.
But now I'm curious to try this while weaving in more GPT.
Maybe it's a size thing.