Alien is basically a huge state machine where every API call that mutates the environment is a discrete step, and the full state is durably persisted after each one.
If something fails mid-update, it resumes from exactly where it stopped. You can also point a deployment to a previous release and it walks back. This catches and recovers from issues that something like Terraform would just leave in a broken state.
For on-prem: we're working on Kubernetes as a deployment target (e.g. bare metal OpenShift)
How is this different from Terraform? Generally if something fails during a TF apply it saves the state of all the stuff that worked and just retries the thing that failed when you next run it. And reverting your TF stack and doing apply again should walk changes back.
There are specific cases where that's not possible, and there are bugs, but it doesn't sound as broken as what you described — unless you meant that you only support a limited subset of resources that are known to be robust to reverts? But that's a fairly different claim.
The main difference is granularity. Terraform runs a plan and applies it as a batch. If something fails, you re-run apply and it retries from the last saved state... but that state is per-resource, not per-API-call.
Alien tracks state at the individual API call level. A single resource creation might involve 5-10 API calls (create IAM role -> attach policy -> create function -> configure triggers -> set up DNS...). If it fails at step 7, it resumes from step 7. Terraform would retry the entire resource.
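To make the granularity point concrete, here's a toy sketch of per-API-call checkpointing — a JSON file stands in for the durable store, and all names are illustrative, not Alien's actual API:

```python
# Hypothetical sketch: each mutating API call is a discrete step whose
# completion is durably recorded, so a rerun resumes at the failed step
# instead of retrying the whole resource.
import json
import os

STATE_FILE = "deploy_state.json"  # stand-in for a durable state store

def load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {"completed": []}

def save_state(state):
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)

def run_deployment(steps):
    """steps is a list of (name, fn). Completed steps are skipped on resume."""
    state = load_state()
    for name, fn in steps:
        if name in state["completed"]:
            continue  # already durably recorded -> skip on rerun
        fn()  # the mutating API call (create role, attach policy, ...)
        state["completed"].append(name)
        save_state(state)  # persist after every call, not per resource
```

If step 7 of 10 throws, the next invocation replays nothing before step 7 — the contrast with a per-resource retry is just where `save_state` sits in the loop.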
The other difference is that Alien runs continuously, not as a one-shot apply. It's a long-running control plane that watches the environment, detects drift, and reconciles. Terraform assumes you run it, it converges, and then nothing changes until you run it again.
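The watch-and-reconcile behavior is the standard control-loop pattern; a minimal sketch (illustrative only, not Alien's code) looks something like:

```python
# Minimal control-loop sketch: observe actual state, diff against desired
# state, apply the missing pieces, repeat forever.
import time

def reconcile_once(desired, observe, apply_step):
    """Return the drift that was found and corrected this pass."""
    actual = observe()
    drift = {k: v for k, v in desired.items() if actual.get(k) != v}
    for key, value in drift.items():
        apply_step(key, value)  # one discrete, checkpointable API call
    return drift

def control_loop(desired, observe, apply_step, interval=30):
    while True:  # long-running, unlike a one-shot `terraform apply`
        reconcile_once(desired, observe, apply_step)
        time.sleep(interval)
```

The difference from Terraform's model is the outer `while True`: convergence is continuous, so drift gets detected and corrected without anyone re-running anything.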
i think the durable state machine approach is smart... that resume-from-where-it-stopped behavior is a big deal during incident response, when you really don't want to rerun an entire deployment just because one step failed. K8s as a deployment target would be huge, especially for the on-prem enterprise crowd. Will definitely keep an eye on that
In practice, unmanaged self-hosting is often less secure, because you end up with outdated versions, unpatched vulnerabilities, and no one responsible for keeping things healthy.
More and more enterprise CISOs are starting to understand this.
The model here is closer to what companies like Databricks already do inside highly regulated environments. It's not new... it's just becoming more structured and accessible to smaller vendors.
Both are real risks. But supply chain attacks exist whether you self-host or not... you're still running the vendor's code either way. The question is whether you also want that code to stay up to date and properly managed, or drift silently.
I agree that keeping things up to date is a good practice, and it would be nice if enterprise CISOs would get on board with that. One challenge we've seen is that other aspects of the business don't want things to be updated automatically, in the same way a fully-managed SaaS would be. This is especially true if the product sits in a revenue generation stream. We deal with "customer XYZ is going to update to version 23 next Tuesday at 6pm eastern" all the time.
This is true even with fully-managed SaaS though. There are always users who don't want the new UI, the changed workflow, the moved button. But the update mechanism isn't really the problem IMO, feature flags and gradual rollouts solve this much better than version pinning
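For what it's worth, the flag-over-pinning idea is just this (a toy sketch, hypothetical flag names): everyone runs the same version, and the customer who isn't ready keeps the old behavior until they opt in.

```python
# Toy per-customer feature flag check (illustrative only). The whole
# fleet runs one version; rollout is controlled per flag, per customer,
# instead of pinning a customer to an old release.
FLAGS = {
    "new_ui": {"enabled_for": {"acme"}},  # acme opted in; others haven't
}

def is_enabled(flag, customer):
    cfg = FLAGS.get(flag, {})
    return customer in cfg.get("enabled_for", set())
```

Updating everyone to version 23 then stops being the scary event — flipping `new_ui` for customer XYZ next Tuesday at 6pm is.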
Sure. I'm just saying in the context where fully-managed SaaS was already decided not to be an option, and a customer is deploying vendor code in their environments, the update mechanism can in fact be a problem. It's not just poor CISO management.
Right, and that's when you do control the environment. Now imagine debugging that when it's your customer's infra, you have no access, and you're relying on them to copy-paste logs on a Zoom call.
It's not RCE. The commands are predefined RPCs written into the deployed code. Customers can review and approve them. Trust between the vendor and the customer is still required and Alien doesn't make it unnecessary.
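A rough sketch of what "predefined RPCs, not arbitrary commands" means in practice (hypothetical names, not Alien's actual mechanism): the vendor can only invoke names registered in the code that was shipped and reviewed.

```python
# Hypothetical command-allowlist sketch: the invocable surface is fixed
# at build time, so customers can audit exactly what the vendor can run.
COMMANDS = {}

def command(fn):
    """Register a function as a reviewable, predefined RPC."""
    COMMANDS[fn.__name__] = fn
    return fn

@command
def rotate_credentials():
    # real work would go here; the point is the name is in the registry
    return "rotated"

def dispatch(name, *args):
    """Vendor-side calls go through here; unknown names are rejected."""
    if name not in COMMANDS:
        raise PermissionError(f"{name!r} is not a predefined command")
    return COMMANDS[name](*args)
```

Whether that's "RCE" is the argument below, but the attack surface is the registry's contents, not an open shell.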
It may not be arbitrary code but it's still remote code execution.
The service provider has direct access to my infrastructure. It's one supply chain attack, one vulnerability, one missed code review away from data exfiltration or remote takeover.
It's heavily inspired by Databricks' deployment model. And you're right that it's not "execute arbitrary commands". Commands are predefined functions in the deployed code that the developer defines upfront and customers can review.
The metrics/logs part is also core to Alien... telemetry flows back to the vendor's control plane so you actually have visibility into what's running.