Hacker News | organsnyder's comments

I've found Qwen3 to be very usable on my local machine (a Framework Desktop with 128GB of RAM). I doubt it could handle the complex tasks I throw at Claude Opus at work, but it's more than capable of handling a surprising number of tasks, with good performance.

What tasks do you use qwen3 for? Coding? Are you running it on CPU or GPU? What GPU does that Framework have?

Thanks!


I have an Asus GX10 that I run Qwen3.5 122B A10B on, and I use it for coding through the Pi coding agent (and my own). I have to put in more work to ensure that the model verifies what it does, but if you do, it's quite capable.

It makes using my Claude Pro sub actually feasible: I write a plan with Claude, pick it up with my local model to implement, and now I'm not running out of tokens haha.

Is it worth it from a unit-economics POV? Probably not, but I bought this thing to learn how to deploy and serve models with vLLM and SGLang, and to learn how to fine-tune and train models with the 128GB of memory it gets to work with. Adding up two 40GB vectors in CUDA was quite fun :)

I also use Z.ai's Lite plan for the moment for GLM-5.1 which is very capable in my experience.

I was using Alibaba's Lite Coding Plan... but they killed it entirely after two months haha, too cheap obviously. Or all the *claw users killed it.


GLM 5.1 is extremely good, and ridiculously cheap on their coding plan. It's far better than Sonnet, and a fifth of the cost at API rates. I don't know if the American providers can compete long-term; what good is being more innovative if it only buys them a six-month lead and they can't build data center capacity fast enough for demand? Chinese providers have a huge advantage in electrical grid capacity.

True but Z.ai also just silently raised the price, and the entire Chinese frontier set is having to make profit now... hence Alibaba killing the Lite plan and not letting people sign up to their Pro one either; and why MiniMax has their non-commercial license, etc. etc.

So I agree with you: it's better than Sonnet but way cheaper. I do wonder how long that will last, though.


Z.ai does really well at the carwash question!

Thank you. I've been using ollama for a much more modest local inference system. I'll research some of the things you've mentioned.

The Framework Desktop has a Ryzen 395 chip that can allocate memory to either the CPU or GPU. I've been able to allocate 100+ GB to the GPU, so even big models can run there.

Most recently I used it to develop a script to help me manage email. The implementation included interacting with my provider over JMAP, taking various actions, and implementing an automated unsubscribe flow. It was greenfield, and quite trivial compared to the codebases I normally interact with, but it was definitely useful.
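For anyone curious what that JMAP interaction looks like: it's just JSON over HTTP. A minimal sketch of the kind of request body such a script might build, following the RFC 8620/8621 shapes (the account ID, mailbox ID, and filter here are placeholders, not from my actual setup):

```python
import json

# Sketch of a JMAP "Email/query" request body (RFC 8620/8621 shapes).
# account_id and mailbox_id are placeholder values for illustration.
def build_unread_query(account_id: str, mailbox_id: str) -> dict:
    return {
        "using": [
            "urn:ietf:params:jmap:core",
            "urn:ietf:params:jmap:mail",
        ],
        "methodCalls": [
            # Ask for up to 50 unread messages in the given mailbox.
            ["Email/query",
             {"accountId": account_id,
              "filter": {"inMailbox": mailbox_id, "notKeyword": "$seen"},
              "limit": 50},
             "q0"],
        ],
    }

# This body would be POSTed to the apiUrl advertised by the server's
# JMAP session resource (fetched from /.well-known/jmap).
body = json.dumps(build_unread_query("acct-123", "mbox-inbox"))
```

The unsubscribe flow builds on the same pattern: query for messages, then fetch headers like List-Unsubscribe with an `Email/get` call.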


That's great. Ostensibly my system could also allocate some of its 32 GB of system memory to augment the 12 GB of VRAM, but I've not been able to get it to load models over 20B. I should spend some more time on it.

Have you checked USB peripherals? I could see a keyboard or mouse causing a wake-up.

It affects my non-cheapo keyboard (Ultimate Hacking Keyboard) as well. Though thankfully I don't have the termination sequence issue.

I have a series of skills and scripts that I manually invoke using Pi. It checks my calendar, Slack, email, etc. and keeps pretty good track of my day, to the point that I trust it enough to get a lot of noise out of my head.

What kind of noise does it filter out for you?

Mainly all the tasks that I otherwise feel I have to track myself. I've found that my head feeling clearer is an indication that I trust a tool to not lose context.

It's especially gratifying (in my experience) to sing in a choir. I've been in the same chamber vocal group[1] for almost 20 years, and it has enriched my life in so many ways. I count the other members of the group as some of my closest friends.

Many areas have ensembles with a range of experience levels and commitment requirements. Now is a great time to look up local ensembles and see when they hold auditions (most of them are probably within the next few months).

[1] https://thechoralscholars.com


Of course. You're always capped by rate. But you're not capped by the cumulative amount (other than as a function of rate and time).

A couple of other options:

If their router supports it, configure the VPN there so it's available for the entire network.

Set up a Raspberry Pi (or similar) on their network that is configured with the VPN and runs a reverse proxy to expose the Jellyfin instance.
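For the Raspberry Pi option, the reverse-proxy half can be as small as a single nginx server block. A sketch, where the tunnel address and port are made-up placeholders rather than a real deployment:

```nginx
# Hypothetical nginx reverse proxy running on the Pi. It forwards LAN
# traffic to a Jellyfin instance reachable over the VPN tunnel;
# 10.8.0.2 and port 8096 are placeholder values.
server {
    listen 8096;
    location / {
        proxy_pass http://10.8.0.2:8096;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```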

But yeah, either of those is going to increase your support burden.


I've never heard of this convention. Every getopt-style CLI tool I've used has identical behavior whether an option is specified in its short- or long-form.

Any Rust CLI built with clap, or Go CLI built with Cobra, supports short and long help, surfaced with `-h` and `--help` (I think Cobra surfaces the long form in its `help` command rather than in `--help`, which is probably a reasonable alternative way to frame this).
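The convention being described can be sketched in a few lines: a one-line usage for `-h`, the full text for `--help`. The tool name and subcommands below are invented for illustration, not from any real CLI:

```python
import sys

# Invented example CLI; "mytool" and its subcommands are placeholders.
BRIEF = "usage: mytool [-h | --help] <command> [args]"
FULL = BRIEF + """

commands:
  sync    synchronize local state with the remote
  status  show a summary of pending changes

options:
  -h      show this brief usage line
  --help  show this full help text
"""

def help_text(flag: str) -> str:
    # Short flag gets the one-line usage; long flag gets everything.
    return FULL if flag == "--help" else BRIEF

if __name__ == "__main__":
    print(help_text(sys.argv[1] if len(sys.argv) > 1 else "-h"))
```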

Jujutsu does it, and it's quite nice.

Keeping services running is fairly trivial. Getting to parity with the operationalization you get from a cloud platform takes more ongoing work.

I have a homelab that supports a number of services for my family. I have offsite backups (rsync.net for most data, a server sitting at our cottage for our media library), alerting, and some redundancy for hardware failures.

Right now, I have a few things I need to fix:

- one of the nodes didn't boot back up after a power outage last fall; I need to hook up a KVM to troubleshoot

- cottage internet has been down since a power outage, so those backups are behind (I'm assuming it's something stupid, like I forgot to set the BIOS to power on automatically on the new router I just put in)

- various services occasionally throw alerts at me

I have a much more complex setup than necessary (k8s in a homelab is overkill), but even the simplest system still needs backups if you care at all about your data. To be fair, cloud services aren't immune to this, either (the failure mode is more likely to be something like your account getting compromised, rather than a hardware failure).


You're spending that much time on it because you're doing too much. Your use of the term "homelab" is telling. I have:

* A rented VPS that's been running for ~15 years without any major issues, only a couple hours a month of maintenance.

* A small NUC-like device connected to the TV for media. Requires near-zero maintenance.

* A self-built 5-drive NAS based around a Raspberry Pi CM4 with a carrier board built for NAS/networking uses. Requires near-zero maintenance.

* A Raspberry Pi running some home automation stuff. This one requires a little more effort because the hardware it talks to is flaky, as is some of the software, so maybe 2-3 hours a month.

The basics (internet access itself) are just a commodity cable modem, a commodity router running a manufacturer-maintained OpenWRT derivative, a pair of consumer-grade APs reflashed with OpenWRT, and a few consumer-grade switches. There's no reason for me to roll my own here, and I don't want to be on the hook for it when it breaks. And if any of the stuff in the bulleted list breaks, it can sit for days or weeks if I don't feel like touching it, because it's not essential.

And yes, I've had hardware failures and botched software upgrades. They take time to resolve. But it's not a big burden, and I don't spend much time on this stuff.

> I have a much more complex setup than necessary

Yup.

> Getting to parity with the operationalization you get from a cloud platform takes more ongoing work.

You don't need this. Trying to get even remotely there will eat up your time, and that time is better spent doing something else. Unless you enjoy doing that, which is fine, but say that, and don't try to claim that self-hosting necessarily takes up a lot of time.


It's definitely mostly a hobby, but I also want to get something close to the dependability of a cloud offering.

I started small, with just a Raspberry Pi running Home Assistant, then Proxmox on an old laptop... growing to what I have now. Each iteration has added complexity, but it's also added capability and resiliency.


A hidden cost of self-hosting.

I love self-hosting and run tons of services that I use daily. The thought of random hardware failures scares me, though. Troubleshooting hardware failure is hard and time-consuming, and keeping spare mini PCs around is expensive. My NAS server failing would have the biggest impact, however.


Other than the firewall (itself a minipc), I only have one server where a failure would cause issues: it's connected to the HDDs I use for high-capacity storage, and has a GPU that Jellyfin uses for transcoding. That would only cause Jellyfin to stop working—the other services that have lower storage needs would continue working, since their storage is replicated across multiple nodes using Longhorn.

Kubernetes adds a lot of complexity initially, but it does make it easier to add fault tolerance for hardware failures, especially in conjunction with a replicating filesystem provider like Longhorn. I only knew that I had a failed node because some services didn't come back up until I drained and cordoned the node from the cluster (looks like there are various projects to automate this—I should look into those).


Canada also (at least some provinces). I have quite a few Canadian software engineer colleagues with their iron rings to prove it.


An iron ring does not technically make you an engineer in Canada; it just says you graduated from an engineering program. A P.Eng., which is a professional engineer's license, is something you acquire after multiple years of experience and testing.

