ebiggers's comments | Hacker News

As someone who works on the Linux kernel's cryptography code, the regularly occurring AF_ALG exploits are really frustrating. AF_ALG, which was added to the kernel many years ago without sufficient review, should not exist. It's very complex, and it exposes a massive attack surface to unprivileged userspace programs. And it's almost completely unnecessary, as userspace already has its own cryptography code to use. The kernel's cryptography code is just for in-kernel users (for example, dm-crypt).

The algorithm being used in this exploit, "authencesn", is even an IPsec implementation detail, which never should have been exposed to userspace as a general-purpose en/decryption API.
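
To illustrate the shape of the attack surface, here is a minimal sketch (not taken from the exploit; the exact template string is illustrative and availability depends on kconfig, e.g. CONFIG_CRYPTO_USER_API_AEAD):

  #include <sys/socket.h>
  #include <linux/if_alg.h>

  /* Sketch: an unprivileged process asking the kernel to instantiate an
   * IPsec-oriented AEAD template via AF_ALG. Illustrative only. */
  int bind_authencesn(void)
  {
      struct sockaddr_alg sa = {
          .salg_family = AF_ALG,
          .salg_type   = "aead",
          .salg_name   = "authencesn(hmac(sha256),cbc(aes))",
      };
      int fd = socket(AF_ALG, SOCK_SEQPACKET, 0);

      /* This bind() alone makes the kernel build the whole algorithm
       * instance, reachable without any privileges. */
      return bind(fd, (struct sockaddr *)&sa, sizeof(sa));
  }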

If you're in charge of the configuration for a Linux kernel, I strongly recommend disabling all CONFIG_CRYPTO_USER_API_* kconfig options. This would have made this bug, and also every past and future AF_ALG bug, unexploitable. In the unlikely event that you find that it breaks any userspace programs on your system, please help migrate them to userspace crypto code! For some it's already been done. But in general, AF_ALG has actually never been used much in the first place, other than in exploits.

I don't think there's much other option. This sort of userspace API might have been sort of okay many years ago. But it just doesn't stand up in a world with syzbot, LLM-assisted bug discovery, etc.


As I did not know what AF_ALG was in the first place, I searched for it and found this:

https://www.chronox.de/libkcapi/html/ch01s02.html

It states the following:

> There are several reasons for AF_ALG:

> * The first and most important item is the access to hardware accelerators and hardware devices whose technical interface can only be accessed from the kernel mode / supervisor state of the processor. Such support cannot be used from user space except through AF_ALG.

> * When using user space libraries, all key material and other cryptographic sensitive parameters remains in the calling application's memory even when the application supplied the information to the library. When using AF_ALG, the key material and other sensitive parameters are handed to the kernel. The calling application now can reliably erase that information from its memory and just use the cipher handle to perform the cryptographic operations. If the application is cracked an attacker cannot obtain the key material.

> * On memory constrained systems like embedded systems, the additional memory footprint of a user space cryptographic library may be too much. As the kernel requires the kernel crypto API to be present, reusing existing code should reduce the memory footprint.

I can't judge whether this is a good justification, but there is one.


If I remember correctly, AF_ALG predates userspace-accessible crypto acceleration and was far more important back when there was an actual need for "SSL accelerator" cards in servers, among other things.

Yes, I remember that time. It was back when I wasn't allowed to know anything about what servers were doing other than to look it up in the internal wiki, which was never maintained.

Hi, embedded firmware engineer here. I give it a B-

There's a weird area between the workloads that fit on a microcontroller, and the stuff that demands a full-blown CPU. Think softcore processors on FPGAs, super tiny MIPS and RISC-V cores on an ASIC, etc. Typically you run something like Yocto on a core like that. Maybe MontaVista or QNX if you've got the right nerd running the show.

So you have serious compute needs, and security concerns that justify virtual memory. But you don't have infinite space to work with, so hardware acceleration is important. Having a standard API built into the kernel seems like a decent idea I guess.

And yet, I've never heard of AF_ALG. I've never seen it used. The thing is, if you have some bizarro softcore, there's a good chance you also have a bizarro crypto engine with no upstream kernel driver. If you're going to the trouble of rolling your own kernel with drivers for special crypto engines, why would you bother hooking it into this thing? Roll your own API that fits your needs and doesn't have a gigantic attack surface.


This suggests that it is useful in some niche embedded use cases, but it should probably not be enabled by default on most desktop/server kernels.

You should take note that this is written by the person that wrote the bad patch.

So grain of salt.


I've said I'm not sure about the validity of that reasoning.

I've linked it nevertheless for context, as an augmentation to the parent's post.


I feel like it should be possible to fulfill these advantages with a minimal, not very complex API. I.e. the grandparent's comment about IPsec implementation details doesn't make the cut, but a hardware accelerated cipher implementation does.

A hardware accelerated DMA-capable cipher implementation is an odd thing, and it’s generally not useful on its own. You might want to set up a whole chain of operations (encrypt, checksum, send to network, for example), but I’ve never encountered a case where you actually want to ask an accelerator to asynchronously encrypt application data and return the encrypted data to the application.

Unless you're pushing a ton of extra work into a network-capable accelerator, that sounds exactly like what you'd want for, e.g., an encrypted S3 implementation. You have encryption, RS encoding, striped checksumming, sending fragments to multiple hosts, some sort of potentially interesting partial failure handling, etc.

You could push that all down to the accelerator, but if there are even a few such use cases you might want a dedicated DMA-capable implementation instead.


But is it true or not? No matter who wrote it. (For objective truth, the author is unimportant.)

It might have been true in 2002 but it hasn't been true since at least about 2010.

You've almost certainly never had a system that supported any hardware accelerated crypto that also required a kernel module.

It's much easier to expose it as CPU extensions.


When you can’t know the objective truth or when there isn’t one (as is the case in making decisions about security tradeoffs in software design), knowing the source of the argument is vital to interpreting its validity.

I disagree 100%. Software security tradeoffs are definitely the sort of thing where you can evaluate arguments on their merits.

Please don't rely on my judgement for this being safe for production, but after blacklisting the modules, the provided python exploit failed.

Check whether the following are built as modules:

  grep CONFIG_CRYPTO_USER_API /boot/config-$(uname -r)

If they are, you can try blacklisting them by creating /etc/modprobe.d/blacklist-crypto-user-api.conf with the following contents:

  blacklist af_alg
  blacklist algif_hash
  blacklist algif_skcipher
  blacklist algif_rng
  blacklist algif_aead

  install af_alg /bin/false
  install algif_hash /bin/false
  install algif_skcipher /bin/false
  install algif_rng /bin/false
  install algif_aead /bin/false

Then regenerate the initramfs:

  update-initramfs -u

Can anyone comment on the ramifications of this?

If iwd, or cryptsetup with certain non-default algorithms, isn't being used on the system, you should be fine. Not many programs use AF_ALG. It's possible there are others I'm not aware of, but it's quite rare.

To be clear, general-purpose Linux distros generally can't disable these kconfig options yet, due to these cases. But there are many Linux systems that simply don't need this functionality.

A good project for someone to work on would be to fix iwd and cryptsetup to always use userspace crypto, as they should.


Is CONFIG_CRYPTO_USER_API needed for hardware acceleration for cryptsetup (dm-crypt) disk encryption?

No, dm-crypt just calls the kernel's crypto code directly.

I can't comment on the ramifications, except to note that elsewhere in the thread this appears not to break anything. (Whether it makes userspace crypto a little less safe is academic; that doesn't matter much if we otherwise have an easy local root shell.) But I can verify that the above fix does protect Ubuntu 24.04 from the exploit.

Just reboot after applying this change.


Or

  zgrep CONFIG_CRYPTO_USER_API /proc/config.gz

Is it built as a module in most distros?

It is built as a module in Debian.

lsmod shows it is not loaded on any of the Trixie or Bookworm machines I have checked, Intel or AMD.


FYI it's dynamically loaded on demand, so lsmod will show it after you try to run the exploit, or you can explicitly load it with:

  modprobe algif_aead
The following mitigation (from the article) does work for Debian 12 and 13, I've tested this:

  echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf
  rmmod algif_aead 2>/dev/null || true
The first line blocks it from loading; the second line unloads it if it has already been loaded. You can test with the same "modprobe algif_aead".

The point of noting whether or not it is loaded on their machine is presumably to indicate that it is not normally loaded (for them), so disabling it to block the exploit should have no impact (for them).

It was loaded on my Ubuntu system so I wonder what used it.

As I understand it, any program can use that socket to write to page cache memory and modify any other program on the system. Even PHP code can be written to do that. So it is a serious problem if there is another security hole on a web server.

Over 500 servers with very varied workloads that I manage didn't have this module loaded, so my guess is "near zero".

Also, only algif_aead is vulnerable.


For anyone wondering: AF_ALG is a Linux socket interface that exposes the kernel’s crypto API via file descriptors, using normal read(2)/write(2) calls for hashing and encryption.
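
A minimal sketch of that flow for hashing (assuming CONFIG_CRYPTO_USER_API_HASH is enabled; error handling omitted):

  #include <stdio.h>
  #include <unistd.h>
  #include <sys/socket.h>
  #include <linux/if_alg.h>

  int main(void)
  {
      struct sockaddr_alg sa = {
          .salg_family = AF_ALG,
          .salg_type   = "hash",
          .salg_name   = "sha256",
      };
      unsigned char digest[32];
      int tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);

      bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa));
      int opfd = accept(tfmfd, NULL, NULL);  /* fd for the actual operations */

      write(opfd, "hello", 5);               /* feed data in */
      read(opfd, digest, sizeof(digest));    /* read the digest back */

      for (int i = 0; i < 32; i++)
          printf("%02x", digest[i]);
      printf("\n");
      return 0;
  }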

I wonder whether the kernel could just remove it and distros could provide a compatibility layer.

It's already a configurable option in the kernel which can be fully disabled by distros if they wanted to provide their own compatibility layer, or just not ship any software that has a hard dependency on it.

I always use only custom compiled kernels on my computers, where I enable only the configuration options that I really need.

So the options related to AF_ALG have always been disabled, because I have not encountered an application that needs them, among those that I use.

Unfortunately, the Linux distributions must enable most options in their default configurations, because they cannot predict what their users will need.


It does enable address space separation of secret keys from user space, which some people love:

https://blog.cloudflare.com/the-linux-kernel-key-retention-s...

https://www.youtube.com/watch?v=7djRRjxaCKk

https://www.youtube.com/watch?v=lvZaDE578yc

So it's not as simple as "should not exist". I agree though that there doesn't seem to be a valid need to expose authencesn to user space.

Disclosure: I'm co-maintaining crypto/asymmetric_keys/ in the kernel and the author/presenter in the first two links is another co-maintainer.


That can be done in userspace as well -- different userspace processes also have different address spaces.

The fact that the first link recommends using keyctl() for RSA private keys is also "interesting", given that the kernel's implementation of RSA isn't hardened against timing attacks (but userspace implementations of RSA typically are).


The CloudFlare blog discusses that idea when they talk about having an "agent process" to hold cryptographic material, but they list drawbacks like having to develop two processes, implement a well-defined interface, and enforce ACLs. I'm not convinced that "developing two processes" is a reason not to do it, since the kernel is effectively just the second process now, but everything else makes sense.

It's unfortunate though since this is one thing I think Windows does decently well. The Windows crypto and TLS APIs do use a key isolation process by default (LSASS) and have a stable interface for other processes to use it [0]. I imagine systemd could implement something similar, but I also know that there are very strong opinions about adding more surface area to systemd.

[0] https://blackhat.com/docs/us-16/materials/us-16-Kambic-Cunni...


TBH LSASS is privileged enough to be a good target for exploits.

> the kernel's implementation of RSA isn't hardened against timing attacks

Cloudflare is using custom BoringSSL-based crypto code in the kernel:

https://lore.kernel.org/all/CALrw=nEyTeP=6QcdEvaeMLZEq_pYB9W...


Can you please give me a real-life example of an application, on a typical Linux laptop or typical Linux server, that would use this CRYPTO_USER_API? None that I looked at seem to use it: openssl, pgp, sha256sum.

As Eric has correctly stated above, we believe iwd (iNet Wireless Daemon), or rather the ell library it relies on (Embedded Linux Library), is the only relatively widespread userspace application relying on it.

Isn't the better argument to ask whether there'd be benefit if all those things did?

A lack of adoption isn't a priori a good argument against an interface, and serious bugs can happen anywhere.

My personal opinion for a while has been that crypto operations should be in the kernel so we can end the madness that is every application shipping its own crypto and trust system, which has only gotten worse since containers were invented.


> My personal opinion for a while has been that crypto operations should be in the kernel so we can end the madness that is every application shipping its own crypto and trust system, which has only gotten worse since containers were invented.

There’s a valid argument here but I think that’d devolve into the DNSSec trap without both a very well-designed API and a stable way to ship updates for older kernels. If people can’t get good user experience or have to force kernel upgrades to improve security, most applications will avoid it. Things like Chrome shipping their own crypto mean that they can very quickly ship things like PQC without waiting years or having to deal with issues like kernel n+1 having unrelated driver or performance issues which force things into a security vs. functionality fight.


Which does sort of loop back around to the issue of Linux not having a stable ABI as a feature, I suppose; a stable kernel-module ABI would be one way to implement this with long-term compatibility.

But the Chrome example also highlights the problem: Chrome might ship it, but vanishingly little software is ever going to upgrade and we've got an explosion of statically linked languages now.


If Linux does that, I really hope it can be done in a standardized way that doesn't make porting to *BSD more difficult than it already might be. Standards are a good thing.

> A lack of adoption isn't apriori a good argument against an interface

I mean, it kind of is (perhaps not a priori, but why is that relevant?). If something is not being used, it's not meeting needs, so it's just increasing attack surface without benefit.


The primary benefit of AF_ALG is IMHO when it's combined with kernel keyrings, i.e. ALG_SET_KEY_BY_KEY_SERIAL.

To steal from the sibling post:

> * When using user space libraries, all key material and other cryptographic sensitive parameters remains in the calling application's memory even when the application supplied the information to the library. When using AF_ALG, the key material and other sensitive parameters are handed to the kernel. The calling application now can reliably erase that information [...]

It's even more than this: you can do crypto ops in user space without ever even having the key to begin with.
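
A hedged sketch of how that key-by-reference flow looks (ALG_SET_KEY_BY_KEY_SERIAL only exists on fairly recent kernels; key type and permission details are elided, so treat this as illustrative rather than a tested recipe):

  #include <stdint.h>
  #include <sys/socket.h>
  #include <linux/if_alg.h>

  /* The application only ever holds a key *serial* (e.g. from add_key(2) or
   * keyctl(1)); the raw key bytes stay in the kernel keyring. */
  int skcipher_from_keyring(int32_t key_serial)
  {
      struct sockaddr_alg sa = {
          .salg_family = AF_ALG,
          .salg_type   = "skcipher",
          .salg_name   = "cbc(aes)",
      };
      int tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);

      bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa));
      /* Hand the kernel a key reference; the raw key never enters this
       * process's address space. */
      setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY_BY_KEY_SERIAL,
                 &key_serial, sizeof(key_serial));
      return accept(tfmfd, NULL, NULL);  /* fd to do encrypt/decrypt on */
  }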

[Ed.: that said, maybe AF_ALG should be locked behind some CAP_*]

[Ed.#2: that said^2, I'm putting this one on authencesn, not AF_ALG. It's the extended sequence number juggling that went poorly, not AF_ALG at large. I bet this might even blow up in some strange hardware scenarios, "network packet on PCIe memory" or something like that - I'm speculating, though.]


It doesn't seem to actually get used that way in practice. ALG_SET_KEY_BY_KEY_SERIAL didn't even appear until just a few years ago. And either way, if the interface allows you to overwrite the su binary, whether it theoretically could provide some other security benefit becomes kind of irrelevant.

It is being used that way:

https://github.com/opensourcerouting/frr/blob/2b48e4f97fb021...

And, sure, if it breaks system security it's pointless. But so did "dirty pipe".

I do agree the number of issues in AF_ALG is annoying, which is why I suggested a CAP_* restriction. Maybe CAP_SYS_ADMIN in init_ns, that's kinda the big hammer.


Better implemented as another user space process than in the kernel.

You can't access TPMs that way.

Most of the Linux kernel crypto is not touching the TPM. If there is a TPM task, only that code should be in kernel, and it should be accessed from user space by a process with the appropriate token.

Yes, AF_ALG is exposing too many things, like authencesn, which has zero reason for being userspace accessible. It's a crypto mode specific to IPsec.

However,

> it should be accessed from user space by a process with the appropriate token.

That is AF_ALG. The operations it offers are what you need for full coverage. It has two issues:

- usage-specific crypto in the kernel implements the same interfaces, and AF_ALG doesn't have a filter for that, as mentioned above. It's not offering too many operations; it's offering too many algorithms.

- it's trying to be fast. I guess people also want to use crypto accelerators through it. (Which is kinda related to TPMs, there is accelerator hardware with built-in protected key storage...)

The CVE we're looking at here is in the intersection of both of these.


All the uses of vmsplice etc are a bit tricky, and that points to the need for a better interface. But given you're using splice, why not do the crypto in user space? A belief that it is better to be fast and buggy than safe and slower?

If neither a hardware component nor kernel key management is involved, crypto should be done in userspace, end of sentence.

The more I think about it, the more I think it should be behind CAP_SYS_ADMIN, or a new CAP_KCRYPT (better name TBD. CAP_CRYPT_OFFLOAD?)
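
Roughly what such a gate might look like (a sketch against crypto/af_alg.c's socket-creation path; function name and signature per my reading of that file, untested):

  /* Hypothetical: refuse AF_ALG sockets to unprivileged userspace.
   * (kern is set for in-kernel socket creation, which stays allowed.) */
  static int alg_create(struct net *net, struct socket *sock, int protocol,
                        int kern)
  {
      if (!kern && !capable(CAP_SYS_ADMIN))
          return -EPERM;

      /* ... existing alg_create() body continues unchanged ... */
  }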


Yes it should definitely require a capability.

Still a risk that some admin-enabled method (like enabling an IPsec VPN) provides a path to it, but would reduce the potential for crafting weird inputs.


I'm also wondering if it couldn't be rewritten to use io_uring interfaces.

That's really orthogonal (and you can already do io_uring with AF_ALG, at the end of the day AF_ALG is just recvmsg() and sendmsg(), which work just fine in io_uring...)

I mean for more efficient and easier-to-verify out-of-kernel implementations of crypto with kernel-like speeds.

Good

Cheesecake

Now, is your comment contributing more to this discussion, or mine?


I was completely unaware of https://syzbot.org, thanks for sharing!

> syzbot system continuously fuzzes main Linux kernel branches and automatically reports found bugs to kernel mailing lists. syzbot dashboard shows current statuses of bugs. All syzbot-reported bugs are also CCed to syzkaller-bugs mailing list. Direct all questions to syzkaller@googlegroups.com.


YAGNI stocks are rising, Gentoo devs that compile their own kernel probably yeeted this module. Alpine, and MUSL deviants are probably immune to this downswing.

DRY looking very bearish, do repeat yourself, do build your own, do use userspace tools even if the kernel has its own version. Not as big a hit on the DRY philosophy as those pip and npm supply chain attacks last couple of weeks though.

KISS remains unaffected for the time being.


I think the issue here is not "Don't Repeat Yourself", but "Don't Reinvent the Wheel". If your wheel is just a circle of wood, you're better off building it yourself than hiring a skilled (or sometimes not so skilled) laborer. Too much overhead and risk.

I love this. I think everyone in software should be feeling a tinge of “we should trim the fat” right now - get rid of as much of the old and infrequently used/tested code as we can. Push users towards the better tested alternatives.

But but but … wE dOnT bREaK uSErsPaCe!

Why is this available in the kernel on a box that does not use IPsec? Should this be a compile-time-enabled module rather than a generic solution?

The design philosophy of mainstream Linux distros is not like OpenBSD.

Linux distros go to market as maximally capable, maximally interoperable, and maximally available for whatever the users want to do. So there is a lot of "shovelware" that is unnecessarily installed with your base system. A lot of services are enabled that you don't need. A lot of kernel modules are loaded or ready to spring into action as soon as you connect hardware that the kernel recognizes.

All this maximizing also increases the system's attack surface, whether local or over the network. The resources, time, and effort needed to update the system and maintain all those packages increase as well. The TCO is high.

With OpenBSD, the base system is hardened and the code is audited with security in mind. They only install or enable essential functions. So it's up to the user to dig in, customize it, and add in features that are needed.

The good news is that you can do some after-market hardening. Uninstall software that you're not using, and disable non-essential services. Tune your kernel for special-purpose, or general-purpose, but not every-purpose.

There are now special distros for containers and VMs with minimal system builds. They are designed to be as small and lightweight as possible. That is a good start in the right direction.


Thanks for the explanation. I am wondering whether it is possible, or whether it makes sense, to have a modular Linux that does not have these attack surfaces enabled by default. Alpine is my default solution for most Linux use cases (except when I need GPU support).

Not "by default", but still Gentoo. My USE= is several lines worth of -this -that -all-the-things. I got rid of wayland, pipewire, pulseaudio, avahi and a shitload of other stuff I don't need.

PulseAudio applications can still produce (but not record) audio through apulse and my handcrafted asoundrc


I think it would be reasonable to deprecate af_alg in favor of a character device. It's more accessible that way. The downside is that the maintainers hate adding new ioctls. I think that's fair. But I don't think a "regular" device node would cover the functionality userland expects.

That said, elsewhere ITT it's pointed out there are only a few use cases so far.


Removing this will make the friendly spooks at NSA very sad....

No, it'd make me sad. If they're lurking in there and we can do without, I'm happy to always have my own .config

If this gets removed, they'll creep in somewhere we can't find them for a while.


How did it get in? Isn’t Linus known for being rightfully fussy about what makes it into the kernel?

Would be an interesting story.


Linus has been fussy about maybe 5% of things, because even then he couldn't keep up with the sheer volume. Nowadays it's more like 1‰.

Many things, such as ksmbd, seem ill-advised when looked at from a security standpoint. The new era of AI-driven exploits will likely make projects more wary of adding functionality.

iwd requires CONFIG_CRYPTO_USER_API_AEAD, so disabling this would break Wi-Fi for a lot of people.

Indeed, iwd is the main reason why general-purpose Linux distros can't disable AF_ALG yet. But many Linux systems are more specialized and don't have wireless connectivity, or they use another wireless daemon such as wpa_supplicant which doesn't have this issue.

I'm hoping we can get iwd fixed to use a userspace crypto library, as well. This is something that people could help with.

iwd also runs as root, so it would be okay with a CAP_SYS_ADMIN permission check if one were introduced, I think.


any idea what software this will break once I turn this kernel configuration off?

iwd is the main culprit (for systems that use it instead of wpa_supplicant).

I think cryptsetup / LUKS also requires it with some non-default options. With the default options, it works fine with the kconfigs disabled.

There's not much else, as far as I know. Normally programs just use a userspace library instead, such as OpenSSL.
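
For reference, the userspace equivalent most programs use instead is just a few lines against a library such as OpenSSL (a sketch; error handling trimmed):

  #include <stdio.h>
  #include <string.h>
  #include <openssl/evp.h>

  int main(void)
  {
      unsigned char digest[EVP_MAX_MD_SIZE];
      unsigned int len = 0;
      const char msg[] = "hello";

      /* One-shot SHA-256 entirely in userspace; no kernel crypto involved. */
      if (!EVP_Digest(msg, strlen(msg), digest, &len, EVP_sha256(), NULL))
          return 1;

      for (unsigned int i = 0; i < len; i++)
          printf("%02x", digest[i]);
      printf("\n");
      return 0;
  }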


If it should not exist, what's the rationale for keeping it instead of removing it?


It'd make a lot of sense to sandbox AF_ALG, then, wouldn't it? At least for userspace-driven invocations. Let the kernel keep its current code-path for kernel-driven invocations, but have the same code unit files also build some other sandboxed form, to be invoked by the crypto-accelerating syscalls.

If these syscalls are used by userspace as rarely as you say, the performance impact of this kind of sandboxing wouldn't matter much. And maybe there could be a KCONFIG/boot flag to switch back to using the un-sandboxed code path for userspace invocations too, for enterprises stuck with old software who really care.

---

My own thought process on how this could work below (but I'm not a kernel contributor, so you can probably immediately picture a design better than I can):

The naive way to do this, would be for the kernel build process to emit a separate AF_ALG userland IPC server as an additional build artifact; to get distros to package this IPC server as a component package of kernel packages; and to set up the sandboxed AF_ALG "kernel bridge" so that it proxies calls through to this IPC server if it exists, and errors out otherwise. (Basically like kfuse, except in this case the only "FUSE servers" are first-party.)

But that's a bit painful, organizationally. Puts a lot of work on the distro maintainers' shoulders, that they might just not bother doing. Prone to error. I think there are better alternatives.

1. Maybe the userland syscalls that rely on AF_ALG could instead ground out inside the kernel in a copy of AF_ALG that's been compiled to eBPF? Then that eBPF bytecode could just be embedded into the kernel.

2. Maybe the Linux kernel could consider a facility that would enable it to act as a hybrid microkernel (similar to macOS's XNU) — with arbitrary static sections of the kernel image/kernel modules [or perhaps standalone static ELF binaries embedded within kernel/kmod .data sections] being spawned not as supervisor-mode kthreads doing their own autonomous thing, but rather as unprivileged user-mode kernel threads, running as IPC-servers for the rest of the kernel to talk to?

- The rest of the kernel could talk to these "userspace kthreads" via some nonblocking IPC mechanism; but this mechanism wouldn't need to be exposed to userland the way macOS's XPC is; it could be kernel-to-kernel only (where these "userspace kthreads", despite being in userspace, are still fundamentally kernel threads, and so get to participate in it.)

- Also, these "userspace kthreads", when they're the active scheduled task, would have the kernel image's read-only sections [or their binary's sections, from within the kernel's .data section] mapped into their address space, since that's the binary they're executing against. But they wouldn't inherit [or the spawning mechanism would actively prune from their task struct] the rest of the kernel's mappings. So they'd have to either use the IPC mechanism, or use regular syscalls, to do anything with the kernel, just like any userspace task.)


I don't see those eBPF or microkernel ideas as being particularly realistic! But there are some simple ways AF_ALG's attack surface could be reduced (as an intermediate step to disabling it entirely), like requiring CAP_SYS_ADMIN and/or limiting the algorithms to a specific list.

What other kernel modules would you suggest disabling that aren't used usually?

IIRC some versions of cryptsetup require access to these APIs.

It doesn’t help that the historically dominant userspace implementation of most of this stuff was OpenSSL, which is also terrible.


This is a great example of why so few people want to be a Linux kernel maintainer. Not only is it largely a thankless "job" where you get blamed for issues you didn't cause and are expected to do much of the work on your own time, but you can potentially get a misleading hit piece published about you and posted to Hacker News just for doing your job.

IMO, what the maintainer did (taking authorship and crediting the original author/reporter via Reported-by, after rewriting the entire patch including the commit message) was in line with kernel conventions. The lines are a bit blurry, and I think keeping the original author/reporter as at least a Co-developer would also have been acceptable. Still, people sometimes complain if they are kept as the author or co-developer if their patch is rewritten, as they don't want to "own" that rewritten patch and take blame for any issues in it. So pick your poison.

Ideally, more time would have been taken to work with the original author/reporter to get their patch in shape. Unfortunately, there isn't always time for that. In this case, the bug was reported to security@kernel.org as a security vulnerability, so that throws much of the usual process out the window; it needed to be fixed quickly. The maintainer went out of their way to get it fixed quickly, in a better way, and even added unit tests for it later. The original author/reporter was credited in both the fix commit and the pull request merge commit. Also note that the maintainer's commit is dated June 7 and was merged into mainline on June 9. So AFAICS, it predated the original author/reporter sending a revised patch; it postdated only the first patch.


I agree in general (also things have tendency to escalate to unnecessary drama through re-posts and comments on the social networks) but:

> as a security vulnerability, so that throws much of the usual process out the window; it needed to be fixed quickly

Was not the bug reported originally six years ago?


It may have been, but security impact is often not recognized right away. The older report was not sent to security@kernel.org and did not include a root cause analysis.


How were you comparing ISA-L and libdeflate? For decompression I've found that the latest version of libdeflate is slightly faster than ISA-L.


My extended benchmarks [0] use the rapidgzip, igzip, gzip, and pigz command line utilities and simply redirect the output to /dev/null to minimize I/O write interference. That's where I got the comparison of igzip to "zlib" as it is used in pigz. I did not compare libdeflate very often to these benchmarks, but before posting my comment, I quickly ran "time libdeflate-gzip -f -k -d 4GiB-base64.gz" inside /dev/shm, which took 20s vs. 9s for igzip.

libdeflate-gzip is something I built and installed a while ago from libdeflate/programs/gzip.c. libdeflate-gzip -V prints: "gzip compression program v1.18. Copyright 2016 Eric Biggers". I am aware that lots of care also has to be taken with I/O, which might make the command line utility slower than the library interface, but doing the tests in /dev/shm hopefully alleviated this. I am also aware that base64-encoded random data is a weird test case but it has its pros because it is a kind of minimal benchmark for raw Huffman decoding speed without (many) LZ references that need to be resolved.

I redid the benchmark as outlined above with the three test files that I am also using for my extended benchmarks [0]:

    4GiB-base64.gz            -> libdeflate: 20.5 s, igzip: 9.4 s, rapidgzip: 1.5 s
    20xsilesia.tar.gz         -> libdeflate:  5.4 s, igzip: 6.6 s, rapidgzip: 1.8 s
    10xSRR22403185_2.fastq.gz -> libdeflate:  5.8 s, igzip: 5.5 s, rapidgzip: 1.9 s
File Sizes: Compressed -> Uncompressed:

    4GiB-base64.gz            : 4294967296 -> 3263906203
    20xsilesia.tar.gz         : 1364776140 -> 4239155200
    10xSRR22403185_2.fastq.gz :  970458140 -> 3618153020
In conclusion, it seems that it highly depends on the test case, and the one I tested (too quickly) to check my statement is one of the outliers.

[0] https://github.com/mxmlnkn/rapidgzip#scaling-benchmarks-on-2...


I also did benchmarks with zlib and libarchivemount via their library interfaces here [0]. It has been a while since I ran them, so I forgot. Unfortunately, I did not add libdeflate. I did not even add ISA-L. At that point, I would already have been glad if my custom-written gzip decompressor could match the speed of the gzip command line utility, which for some weird reason is half as fast as zlib.

[0] https://github.com/mxmlnkn/rapidgzip/blob/master/src/benchma...


Note that libdeflate has used essentially the same method since 2016 (https://github.com/ebiggers/libdeflate/blob/v0.4/lib/adler32...), though I recently switched it to use a slightly different method (https://github.com/ebiggers/libdeflate/blob/v1.12/lib/x86/ad...) that performs more consistently across different families of x86 CPUs.
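
For anyone unfamiliar with what's being optimized there: the underlying checksum itself is tiny, and a plain scalar version (not libdeflate's actual code) looks like the sketch below; the vectorized variants in those links mainly differ in how long they can defer the mod-65521 reductions.

  #include <stddef.h>
  #include <stdint.h>

  #define ADLER_MOD 65521

  /* Scalar reference Adler-32: s1 is the running byte sum, s2 the running
   * sum of s1 values, both reduced mod 65521. */
  static uint32_t adler32_ref(const uint8_t *p, size_t len)
  {
      uint32_t s1 = 1, s2 = 0;

      while (len--) {
          s1 = (s1 + *p++) % ADLER_MOD;
          s2 = (s2 + s1) % ADLER_MOD;
      }
      return (s2 << 16) | s1;
  }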


> In this somewhat unusual case, the problem was found and corrected in the Linux kernel through a typical bug-fix process and not handled as a security vulnerability, so no CVE was assigned.

This isn't unusual. This is actually the usual case.

The unusual thing here is actually that someone downstream noticed they were missing a fix (probably because its LTP regression test was failing, which is also unusual because most kernel fixes don't have an LTP regression test).

More commonly, no one notices and these bugs never get fixed in downstream kernels that aren't staying up to-date with LTS; and there is never a CVE, an oss-security post, a Hacker News thread, etc. But these bugs are still there.


> More commonly, no one notices and these bugs never get fixed in downstream kernels that aren't staying up to-date with LTS

Actually a lot of RHEL subsystems are updated wholesale by including all upstream patches (not just those that go into LTS kernels), and this way all such fixes are automatically included.


Well, the longer that vulnerabilities are kept secret, the longer that users are unable to take any action to protect themselves, and the less incentive that vendors have to roll out fixes quickly and to prevent vulnerabilities in the first place. See the Project Zero disclosure FAQ: https://googleprojectzero.blogspot.com/p/vulnerability-discl...


SELinux actually does significantly reduce the kernel attack surface on Android, and it has made a lot of kernel vulnerabilities unexploitable there. This particular bug was simply one in the remaining attack surface.


syzbot is already fuzzing the latest two stable kernels and has found hundreds of bugs, including lots of use-after-frees. All these bugs are listed here:

- https://syzkaller.appspot.com/linux-4.14
- https://syzkaller.appspot.com/linux-4.19

As far as I know, no one is doing anything with the syzbot bugs against stable kernels directly, since no company using Linux is paying anyone to do it as their job. But some are getting fixed; e.g., some get reported against mainline too, then fixed and backported.


How complex is it to test all known issues against all current kernels?

A weekly report with some easy to understand graphs would probably convince more people to work on these bugs.


Time and cost, same as it would be to do it across all kernel versions, not just current ones. Theoretically it could be done pretty simply via a CI/CD pipeline if someone wrote solid test cases for the issues found by the fuzzer.


That's my point! Make this part of the kernel regression suite and also run it on the old kernels.


AFAICS, this was exposed by the addition of sockfs_setattr() in v4.10. So it's incorrect to claim that kernels older than that are vulnerable, even though the code being fixed was older.

Also, note that there may not actually be a proof-of-concept exploit yet, beyond a reproducer causing a KASAN splat. When people request a CVE for a use-after-free bug they usually just assume that code execution may be possible. (Exploits can be very creative.)


(I'm one of the authors of the blog post)

We considered it, of course, along with many other block ciphers. However, heavily optimized Threefish-256 is 22.6 cycles per byte on Cortex-A7 (by far the most common CPU this is needed on) which is over twice as slow as Adiantum. Threefish-512 and Threefish-1024 would be much slower still. We're already at the borderline of the performance needed to actually get all Android devices encrypted, so over 2x worse performance is a no-go.

Threefish also wasn't published as a standalone block cipher but rather was part of Skein, which lost the SHA-3 competition. Therefore it hasn't received as much cryptanalysis as ChaCha and AES, and probably won't get much more in the future.

Finally, note that unlike Adiantum, Threefish isn't a wide-block cipher, where flipping one bit in the sector scrambles all other bits. So comparing its complexity directly to Adiantum's is somewhat unfair. Other wide-block modes such as HCH and HCTR are also more complex than narrow-block modes.


Thank you for the additional insight!

