I'd get confused if I were an LLM and you put my entire prompt in a text file attachment. I'd be like, "is this the user or is this a prompt injection??"
If you paste a long enough prompt into either GPT or Claude, it gets turned into an attachment, so this can happen. I think the conversion is invisible to the model, but somehow not to the summarizer.
Errors compounding is a meme. In iterated as well as verifiable domains, errors dilute rather than compound, because the LLM has repeated chances to notice its failure.
Well, there can't be direct evidence: it's a private corporation and we don't know how big the model is. But you can look on OpenRouter for hosts that offer open models with known sizes, where there's no brand and so no incentive to subsidize, and their prices don't look wildly bigger than OpenAI/Anthropic API prices.
edit: example: GLM 5.1, a 751B model, is offered for $0.60/M in, $4.43/M out. Scuttlebutt (i.e. I asked Google's AI) seems to think that Opus 4 is a 1T/5T MoE model, so you can treat it (with some effort) as a 1T model for pricing purposes. Its API pricing is $1.55/M in, $25/M out, i.e. 2x to 5x more than GLM. Idk what to say other than this sounds about right, probably with a healthy margin.
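To check the "2x to 5x" claim, here's a trivial sketch of the arithmetic using the per-million-token prices quoted above (a rough comparison; the Opus figures are scuttlebutt, not confirmed):

```python
# Prices in $ per million tokens, as quoted in the comment above.
glm_in, glm_out = 0.60, 4.43    # GLM 5.1 via an OpenRouter host
opus_in, opus_out = 1.55, 25.0  # rumored Opus figures (unverified)

ratio_in = opus_in / glm_in     # ratio on input tokens
ratio_out = opus_out / glm_out  # ratio on output tokens

print(f"input: {ratio_in:.1f}x, output: {ratio_out:.1f}x")
# → input: 2.6x, output: 5.6x
```

So the "2x to 5x" range roughly holds on input, and output lands just past the top of it.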
I always avoided Ollama because it smelled like a project that was trying so desperately to own the entire workflow. I guess I dodged a bigger bullet than I knew.
This makes sense if Anthropic think they're the best positioned to make safe AI. However, if you are already looking at an AI company, there's obviously some selection effect happening.