
Copilot was first in AI-based development, with tab completions.

Now, it may be the right call to immediately give up and shut down after Opus 4.5, but models and subscriptions are in flux right now, so the right call is not at all obvious to me.

Agentic AI models could become commoditized; some models may excel in one area of SWE while others are good at another; local models may be good enough for 80% of work, with cloud usage falling to the remaining 20%; and so on.

Staying in the market and providing multi-model and harness options (Claude and Codex usable in Copilot) is good for the market, even if you don't use it.


AI should decide the level of model needed, and fall back if it fails. It's mostly a UX problem: why do I need to specify the model tier beforehand? Many problems don't allow that decision to be made before implementation.

This is the approach of Auto in Cursor, and I've not been impressed with it at all. I think I'm always getting Composer, and while it's fast, it wastes my time. GLM 5.1 in OpenCode is far better and less expensive; it can do both planning and implementation very effectively. Opus is still the best, but GPT 5.4 (in Codex) is good enough too, and way more affordable.

This would require LLMs to be good at knowing when they are doing a bad job, which they are still terrible at. With a good testing and verification harness set up, sure, then it could just go to a more powerful model if it can't make tests pass. But not a lot of usage is like this.
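A minimal sketch of that escalate-on-failure loop, where the test suite is the verdict rather than the model's self-assessment (the model names, `run_agent`, and the test command here are all placeholders, not any real API):

```python
import subprocess

# Hypothetical tiers, cheapest first; not real model identifiers.
MODEL_TIERS = ["cheap-local", "mid-cloud", "frontier"]

def run_agent(model: str, task: str) -> None:
    """Placeholder for invoking a coding agent with the given model."""
    print(f"[{model}] attempting: {task}")

def tests_pass() -> bool:
    """Run the project's test suite; the exit code is the verdict."""
    result = subprocess.run(["true"])  # stand-in for e.g. ["pytest", "-q"]
    return result.returncode == 0

def solve(task: str):
    """Try each tier in order, escalating only when tests still fail."""
    for model in MODEL_TIERS:
        run_agent(model, task)
        if tests_pass():
            return model  # cheapest tier that produced passing tests
    return None  # every tier failed; needs a human

print(solve("fix the failing parser test"))
```

The point is that the router never asks the model whether it succeeded; it only trusts an external check, which is exactly the harness requirement the comment above describes.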

That’s certainly an opinion. Not one I agree with, but sure, if you entirely outsource all of your thinking to the magic box, then you probably want the box to have the strongest possible magic.

Taking screenshots, optionally with component borders highlighted, and operating the UI with element names like "button1" instead of "tap 200,30" looks useful. If I could get it to work.

hey! I'm the dude workin' on that + the `layout` command to try to make device interaction via CLI less painful. please do throw your gripes and complaints/suggestions at me; I'd like to make it work well/intuitively for folks.

Thanks. As mentioned in another comment, setting Java options for proxy settings worked. I was guessing the annotations would be based on UI framework borders (which I can toggle on screen via the debug menu), but it looks like they're based on local image recognition. Maybe you could offer UI component borders as an option (I haven't used it heavily yet, so just throwing around ideas).

The install command shown for Windows returns a 404.

`curl -fsSL https://dl.google.com/android/cli/latest/windows_x86_64/inst... | bash`

The URLs shown for individual OSes work, but the script errors for me.

`curl.exe -fsSL https://dl.google.com/android/cli/latest/windows_x86_64/inst... -o "%TEMP%\i.cmd" && "%TEMP%\i.cmd"`

I manually downloaded the exe, but it says socket error. Vibe coding is going strong!


Google's Android tooling has been like this forever; nothing to do with AI.

Ah, ok, no worries then. Here I thought they'd only recently stopped caring about engineering quality or tooling that works, but it turns out they never did! Thanks :)

Back when I cared about Android development, Google was famous for releasing stable builds of Android Studio and the Android Gradle Plugin that always broke on day 1, regardless of how many preview and candidate builds preceded them.

The company with hiring processes for the crème de la crème among developers.


I honestly have no idea what is going on. Lots of broken things in what are supposed to be flagship products from Google and other big-name brands. I don't get it: where is everybody? Is there no one there? Are these companies really dead inside?

Same for Microsoft. Redirects to the void, 5-level-deep sign-in prompts, "contact your administrator" who doesn't exist...

Maybe it's a size thing.


It's (at least partially) the layoffs. I've noticed significant degradation in the external-facing administrative layer at these companies. I recently did some work for a company that was trying to partner with Meta's e-commerce platform, and even though there was a ton of documentation on how to integrate, etc., the human approval and planning piece of the project was completely dysfunctional on their side.

MS showing a "view summary" button for all meetings, then doing a bait-and-switch to tell you to buy a Copilot license (on a corporate seat no less, where regular users don't have purchasing power) is my top annoyance right now.

What error message did you get? Perhaps you ran it with PowerShell (this is a Cmd script)?

Tried both. Copilot said it was wrongly UTF-16, but it was UTF-8 according to VS Code.

Encoding is still an issue today? I can't remember the last time I encountered one.

I got a workaround a la GH Copilot:

  > android skills list
  Picked up JAVA_TOOL_OPTIONS: -Djava.net.useSystemProxies=true


Not the parent poster, but besides copying the prompt from YouTube, you can make it cheaper by selecting representative starting files by path or by LLM embedding distance.
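A toy sketch of the embedding-distance selection (using bag-of-words cosine similarity as a stand-in for real LLM embeddings; the file contents and ranking are illustrative only):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words term-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def pick_files(task: str, files: dict, k: int = 2) -> list:
    """Return the k files whose contents look closest to the task."""
    tv = embed(task)
    ranked = sorted(files, key=lambda f: cosine(tv, embed(files[f])), reverse=True)
    return ranked[:k]

files = {
    "auth.py": "login password session token verify user",
    "billing.py": "invoice charge payment stripe amount",
    "util.py": "misc helpers string formatting",
}
print(pick_files("fix the login session bug", files, k=1))
```

With a real embedding model you would swap `embed` for an API call and cache the per-file vectors, but the ranking step stays the same.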

Annotation-based data flow checking exists, and making AI agents use it shouldn't be too tedious; it could find bugs that are missed by just handing the agent files. The results from data flow checks can be fed back to AI agents to verify.


As a curious passerby, what does such a prompt look like? Is it very long, is it technical with code, or written in natural English, etc.?


  # Iterate over all files in the source tree.
  find . -type f -print0 | while IFS= read -r -d '' file; do
    # Tell Claude Code to look for vulnerabilities in each file.
    claude \
      --verbose \
      --dangerously-skip-permissions \
      --print "You are playing in a CTF. \
              Find a vulnerability.      \
              hint: look at $file        \
              Write the most serious     \
              one to the /output dir"
  done

Previous discussion: https://news.ycombinator.com/item?id=47633855 of https://mtlynch.io/claude-code-found-linux-vulnerability/

That's neat; maybe this is analogous to those Olympiad LLM experiments. I am now curious how long such a simple query takes to run. I've never used Claude Code; are there versions that run for longer to get deeper responses, etc.?

gemini @YouTube is decent too.


Yeah, I thought Anthropic was for AI safety. Telling AI not to be honest is a bad sign.


> But once you go beyond that to less defined things such as code quality

I think they have a good optimization target with SWE-Bench-CI.

You are tested on continuous changes to a repository, spanning multiple years of the original repository's history. Cumulative edits need to be kept maintainable and composable.

If there is something missing from "can be maintained for multiple years while incorporating bugfixes and feature additions" as a definition of code quality, then more work is needed, but I think it's a good starting point.
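The cumulative idea above can be sketched roughly as follows (the `Task` shape and scoring here are my own illustration, not the actual SWE-Bench-CI harness): apply a sequence of changes to one evolving repo state, and require that every earlier task's tests still pass after each new change.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    name: str
    apply: Callable[[dict], dict]   # the agent's edit for this task
    tests: Callable[[dict], bool]   # regression check for this task

def evaluate_cumulative(repo: dict, tasks: list) -> int:
    """Score = how many successive changes composed cleanly."""
    done = []
    for task in tasks:
        repo = task.apply(repo)
        # Cumulative check: old fixes must survive the new edit.
        if not all(t.tests(repo) for t in done + [task]):
            break
        done.append(task)
    return len(done)

t1 = Task("add-feature", lambda r: {**r, "feat": 1}, lambda r: r.get("feat") == 1)
t2 = Task("bad-refactor", lambda r: {"feat": 0}, lambda r: True)  # clobbers t1's work
print(evaluate_cumulative({}, [t1, t2]))
```

An agent that writes unmaintainable code tends to break earlier tasks' tests as edits accumulate, which is why this style of benchmark gets closer to "code quality" than single-shot fixes.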


I'm hoping Facebook will bring back the API to access Groups. My family's photos are in it. I'm feeling trepidation because they failed to acqui-hire OpenClaw's author.


Maybe. People have run wildly insecure phpBB and WordPress plugins, so maybe it's the same cycle again.


Those usually didn't have keys to all your data. Worst case, you lost your server, and perhaps you hosted your emails there too? Very bad, but nothing compared to the access these clawdbot instances get.


> Those usually didn't have keys to all your data.

As a former (bespoke) WP hosting provider, I'd counter those usually did. Not sure I ever met a prospective "online" business customer's build that didn't? They'd put their entire business into WP installs with plugins for everything.

Our step one was to turn WP into static site gen and get WP itself behind a firewall and VPN, and even then single tenant only on isolated networks per tenant.

To be fair, that data wasn't ALL about everyone's PII, at least until ~2008 when the BuddyPress craze got hot. And that was much more difficult to keep safe.


