
csbartus · 2026-06-14T05:37:22 1781415442

In my 30+ year SWE/SWA career, this is the first time I can harvest the benefits of a well-defined and exactly implemented architecture.

Thanks to LLMs.

Before LLMs, even when the architecture principles were simple and clear, distilled into templates, with codegens added for boilerplate / skeleton generation, it was impossible to follow them in the long run. Devs tried their best, but over time everything eroded and there were no resources for refactoring.

Now, with coding agents, I was able to create a production grade app following a similar architecture to Presentation Domain Data Layering, from this article.

Now the codebase is 100% uniform both in content (code) and structure (files and folders). It reads as if written by a single person. Finding a specific file takes a second with no cognitive load. Editing a file is straightforward since every file follows a specific template.

LLMs have benefits and drawbacks, and in this case their help is enormous.

simonw · 2026-06-14T05:43:29 1781415809

Something I recently realized is that the fastest and easiest way to use coding agents is if you apply them to problems where there is just one, obvious solution.

This absolutely relates to architecture. If your system is designed such that any given feature fits in an obvious place, using obvious patterns, with obvious ways to test it... 90% of the time a coding agent will be able to do exactly the right thing from a single, short prompt.

This also makes code review so much less taxing - if the solution is obvious, reviewing and checking that the agent followed that obvious path takes much less time than if you're trying to untangle something a lot more complicated.

csbartus · 2026-06-14T05:48:32 1781416112

Exactly! What I use is a main workflow document where I embed at every step pointers to architecture and templates.

My prompt is ... "We are implementing the X feature. We are at step 6. Plan first"

Then the agent spits out identical plans then identical code for every feature.
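A rough sketch of how such a workflow document could be modeled, in TypeScript. The actual document is not shown here, so `WorkflowStep`, `buildPrompt`, and the example paths are all hypothetical names made up for illustration:

```typescript
// Hypothetical model of a workflow document: each step carries pointers
// to the architecture notes and templates the agent should read first.
interface WorkflowStep {
  title: string;
  architectureDocs: string[];
  templates: string[];
}

// Builds the short prompt for one step of one feature.
// Step numbers are 1-based, as in "We are at step 6".
function buildPrompt(
  feature: string,
  stepNumber: number,
  steps: WorkflowStep[],
): string {
  const step = steps[stepNumber - 1];
  if (step === undefined) {
    throw new RangeError(
      `No step ${stepNumber} in a workflow of ${steps.length} steps`,
    );
  }
  const pointers = [...step.architectureDocs, ...step.templates];
  return [
    `We are implementing the ${feature} feature.`,
    `We are at step ${stepNumber}: ${step.title}.`,
    `Read first: ${pointers.join(", ")}.`,
    "Plan first.",
  ].join("\n");
}

// Example workflow with a single step (paths are placeholders).
const exampleSteps: WorkflowStep[] = [
  {
    title: "Types",
    architectureDocs: ["docs/architecture.md"],
    templates: ["templates/types.ts"],
  },
];
```

The point of the sketch is that the prompt stays short because the step, not the prompt, carries the pointers.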

amosjyng · 2026-06-14T06:23:10 1781418190

I am so curious as to how you make this happen.

1. How do you organize your architecture files so that agents know where to find and update architectural info? E.g. everything in one big file, or sharded per module/subsystem with an AGENTS.md for discoverability, or something else?

2. What gets templated? What do your template files contain or look like?

3. How do you get the LLMs to actually slot something into the right place? E.g. a problem I repeatedly run into is the LLM weakening abstraction boundaries. I have to explicitly tell it things such as "No, this is obviously a UI-specific endpoint that belongs on the BFF rather than on the business logic focused backend API." Of course it gets better as I add more examples and rules each time I catch something dumb, but it sounds like you're avoiding this problem altogether with good architecture. How are you doing that?

4. It sounds like you have some sort of workflow that is standardized yet still generalizable enough to cover the generic case of new feature development on the system. How is that possible? What can you share about this flow?
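One mechanical way to approach question 3 is to check import direction instead of relying on the agent's judgment. Below is a minimal sketch, assuming a `src/<layer>/...` folder layout and the rule that each layer may import only from itself or the layer directly below it. Both the layout and the rule are assumptions for illustration, not something the thread confirms:

```typescript
type Layer = "presentation" | "domain" | "data";

// Assumed dependency rule: presentation -> domain -> data, never upward,
// and never skipping the domain layer.
const ALLOWED: Record<Layer, Layer[]> = {
  presentation: ["presentation", "domain"],
  domain: ["domain", "data"],
  data: ["data"],
};

// One import statement, reduced to "file A imports file B".
interface ImportEdge {
  from: string;
  to: string;
}

// Derives the layer from the assumed src/<layer>/... path layout.
function layerOf(path: string): Layer | null {
  const match = /^src\/(presentation|domain|data)\//.exec(path);
  return match ? (match[1] as Layer) : null;
}

// Returns every import that crosses a boundary in a forbidden direction.
// Files outside the three layers are not judged here.
function findViolations(edges: ImportEdge[]): ImportEdge[] {
  return edges.filter(({ from, to }) => {
    const source = layerOf(from);
    const target = layerOf(to);
    if (source === null || target === null) return false;
    return !ALLOWED[source].includes(target);
  });
}
```

Run as a test or a PR check, a rule like this turns "I caught something dumb" into a failing build, which an agent can fix without being told again.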

speedbird · 2026-06-14T10:06:12 1781431572

You can use a well known template like ARC42 and insist on the maintenance of suitable indexes for agent access and review for coherence on a regular basis.

Ultimately any useful documentation needs active curation.

Getting devs to do this well is hard. Agents can be driven to do a fair job through suitable prompt frameworks and repetition, e.g. as PR reviewers.

I’ve had reasonable success with a combination of space dimensions (structured requirements and component-based indexes) and time dimensions (decision records and commits).

Also a strong design model to give common structure: hex arch with ports and adapters and pure business logic.

YMMV
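For readers unfamiliar with the hex arch shape mentioned above, here is a minimal TypeScript sketch. The pricing domain and all names (`applyDiscount`, `CustomerPort`, `InMemoryCustomers`, `priceFor`) are invented for illustration:

```typescript
// Pure business logic: no I/O, the result depends only on the arguments.
function applyDiscount(totalCents: number, isLoyal: boolean): number {
  return isLoyal ? Math.round(totalCents * 0.9) : totalCents;
}

// Port: what the core needs from the outside world, stated as an interface.
interface CustomerPort {
  isLoyal(customerId: string): boolean;
}

// Adapter: one implementation of the port, backed by memory.
// A database or HTTP adapter would implement the same interface.
class InMemoryCustomers implements CustomerPort {
  private readonly loyalIds: Set<string>;

  constructor(loyalIds: Iterable<string>) {
    this.loyalIds = new Set(loyalIds);
  }

  isLoyal(customerId: string): boolean {
    return this.loyalIds.has(customerId);
  }
}

// Application service: connects the port to the pure logic.
function priceFor(
  customerId: string,
  totalCents: number,
  customers: CustomerPort,
): number {
  return applyDiscount(totalCents, customers.isLoyal(customerId));
}
```

Because the core only sees `CustomerPort`, there is exactly one place for each kind of change, which is the property that makes the structure easy for an agent to follow.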

csbartus · 2026-06-06T05:45:30 1780724730

This specify-encode-fulfill loop/method is effective at making agents create bug-free code.

In my version of this workflow I write the spec myself, then let the LLM do the rest.

This way 1.) I'm 100% sure the understanding/spec is good, 2.) it's translated into an executable format so the implementation can be verified, and 3.) the implementation has maximum-coverage tests, which steer the AI to produce code that follows standards, fits into the existing codebase, and is very easy to refactor.
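To make "executable format" concrete, here is one possible shape, sketched in TypeScript: the human-written spec is a table of cases, and a small checker runs any implementation against it. The `SpecCase` type, the `verify` helper, and the slug example are assumptions for illustration, not the format used in this workflow:

```typescript
// A spec as data: each case names a behavior and pins input to output.
interface SpecCase<I, O> {
  name: string;
  input: I;
  expected: O;
}

// Runs an implementation against the spec and returns the names of the
// failing cases. Uses strict equality, so it suits primitive outputs only.
function verify<I, O>(
  impl: (input: I) => O,
  cases: SpecCase<I, O>[],
): string[] {
  return cases
    .filter((c) => impl(c.input) !== c.expected)
    .map((c) => c.name);
}

// Human-written spec for a hypothetical slug function.
const slugSpec: SpecCase<string, string>[] = [
  { name: "lowercases", input: "Hello", expected: "hello" },
  { name: "replaces spaces", input: "a b", expected: "a-b" },
  { name: "trims", input: " a ", expected: "a" },
];

// Candidate implementation, the part an LLM would be asked to write.
function slugify(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, "-");
}

// An empty list means the implementation fulfills the spec.
const failures = verify(slugify, slugSpec);
```

The spec stays under human control while the implementation can be regenerated freely, since `verify` reports exactly which stated behaviors broke.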

So far, this is the one and only advantage of using LLMs in my SWE practice. They glue together (human written) specs with code, with confidence, in no time.

csbartus · 2026-06-06T05:23:00 1780723380

The question is whether AI / LLMs get better.

I'm not an ML expert, but regarding code _quality_ I see no progress at all in the last couple of years. LLMs still write code by using probabilistic calculations vs. applying rigorous thinking and logic.

This is only good while no one has to look under the hood. When trying to understand and fix code written by LLMs you'll realize what a mess they produce. It's a codebase without any systematic thinking inside. Everything is ad-hoc, wired together to pass the tests, and to conform to some templates. No deliberate practice, no intelligence at all in the code.

This can't be a long term strategy for an entire industry.

Animats · 2026-06-06T11:04:39 1780743879

We're going to need some mid-level representation of what software is trying to do. Formal specs? UML? Semi-formal specs in natural language? Design rules? People hate updating such representations, but AIs don't have that problem.

csbartus · 2026-06-06T05:01:59 1780722119

It's a gut feeling.

We _know_ LLMs can't be _as_ good as they are promoted to be.

I've spent the last 6 months creating a production grade app from scratch with Claude, and I wrote not a single line of code. I reviewed the code and it looked good, almost completely following my templates, workflows, skills.

Now I've started to make minor manual updates and I'm horrified. Claude has no idea why there were those templates and instructions in place. It followed them blindly without grasping their spirit. The end result is like a very junior dev copy-pasting answers from Stack Overflow into the codebase. No consistency, chaotic application of different conventions, duplicated code, ghost code (does nothing), and perhaps more as I'm digging in.

The pros: The code works, all tests pass (43% code / 57% tests, 1:1.3 ratio), the UI looks good with visible glitches

The cons: I'll have to rewrite most of the code in the long run to make it fit and easy to maintain.

The verdict: I wouldn't have started this project alone. Claude got me through to v0.1.0 / MVP, where I focused solely on the product: technologies, architecture, functionality, and usability. Now it's easier to refactor everything for v0.2.0 manually, without Claude.

So this might be our gut feeling: we know it's something good, but not as good as the stakeholders might promote. We know it helps in some ways but it's a nightmare in other ways.

We are not anti-AI but rather pragmatic: not the AI enthusiasts we are expected to be.

hn_throw2025 · 2026-06-06T08:42:13 1780735333

> No consistency, chaotic application of different conventions, duplicated code, ghost code (does nothing), and perhaps more as I'm digging in.

I didn’t understand this part. You said you reviewed the code and it was looking good, so how did the cruft creep in? Were you reviewing every diff, or taking an occasional sample?

csbartus · 2026-06-14T05:20:11 1781414411

Good point! Reviewing code, in the AI era, in my practice, means skimming code and looking for patterns.

I use templates / conventions and make the AI generate code using them. When reviewing code I check whether a file uses a specific template and follows specific conventions.

This can't catch subtle errors like a function being re-created instead of re-used (duplicate code), unnecessary code (bloat), or inconsistent naming (a Button component with an associated cssRow() styling function instead of cssButton()).
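The naming case is the kind of thing a small check can catch where skimming does not. A minimal TypeScript sketch, assuming the convention is "each component's styling function is named css plus the component name"; the `ComponentFile` shape and the function names are hypothetical:

```typescript
// Hypothetical summary of one component file: the component it exports
// and the names of the functions defined alongside it.
interface ComponentFile {
  component: string;
  styleFunctions: string[];
}

// Assumed convention: Button -> cssButton, Row -> cssRow.
function expectedStyleName(component: string): string {
  return `css${component}`;
}

// Reports every css* function whose name does not match its component.
function findNamingMismatches(files: ComponentFile[]): string[] {
  const problems: string[] = [];
  for (const file of files) {
    const expected = expectedStyleName(file.component);
    for (const fn of file.styleFunctions) {
      if (fn.startsWith("css") && fn !== expected) {
        problems.push(
          `${file.component}: found ${fn}(), expected ${expected}()`,
        );
      }
    }
  }
  return problems;
}
```

Duplicate and unused code need different tools, but the same idea applies: a convention that can be stated as a rule can be checked by a script instead of by attention.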

When you start editing and using the code, these little things add up, consume your attention, and drain you, leaving no flow and minimal productivity.

Izkata · 2026-06-08T00:59:49 1780880389

Reviewing is a very different mindset than writing it yourself. You don't have all the context you would have built up had you done it, and it's much much more difficult to think through all cases. So I'm thinking: The individual changes all looked good in isolation, and they started borderline rubber-stamping the changes without stepping back to think about the larger context.

Looking at the individual changes in isolation, it's harder to see that a change doesn't match other conventions, duplicates code, removes or disables paths without cleaning up, etc. I'll bet there's also some crazy spaghetti code in there, judging from when I helped a co-worker clean up AI-generated code that they didn't understand.

csbartus · on June 3, 2025

SEEKING WORK | Europe | Remote

I wear multiple hats: Researcher / Senior Software Architect / React Lead / Design Engineer / Junior AI Engineer.

I can help you with your AI software engineering strategy and implementation with a focus on correctness: https://www.osequi.com/studies/list/list.html

- Résumé/CV: https://osequi.com/

- Email: bartus.csongor@gmail.com

csbartus · on June 3, 2025

Senior Software Architect | React Lead | Design Engineer | Remote, EU

- Location: Europe

- Remote: Yes

- Willing to relocate: Maybe

- Résumé/CV: https://osequi.com/, https://chat.osequi.com/ (AMA with AI)

- Email: bartus.csongor@gmail.com

I deliver better software, faster:

- I solve the two top pain points of JS / TS / React [1]

- Using lightweight formal and semi-formal methods [2]

- I integrate multiple disciplines (entrepreneurship, math, computer science, UX/UI design) to create better products [3]

I'm interested in companies with a high societal impact: synthetic biology, education, healthcare, personal growth, financial stability.

[1]: https://2023.stateofjs.com/en-US/usage/#top_js_pain_points

[2]: https://www.osequi.com/studies/list/list.html

[3]: https://www.osequi.com/csongor-bartus-profile.pdf

csbartus · on May 5, 2025

SEEKING WORK | Europe | Remote

I wear multiple hats: Researcher / Senior Software Architect / React Lead / Design Engineer / Junior AI Engineer.

I can help you with your AI software engineering strategy and implementation with a focus on correctness: https://www.osequi.com/studies/list/list.html

- Résumé/CV: https://osequi.com/

- Email: bartus.csongor@gmail.com

csbartus · on May 5, 2025

Senior Software Architect | React Lead | Design Engineer | Remote, EU

- Location: Europe

- Remote: Yes

- Willing to relocate: Maybe

- Résumé/CV: https://osequi.com/, https://chat.osequi.com/ (AMA with AI)

- Email: bartus.csongor@gmail.com

I deliver better software, faster:

- I've created a novel methodology to produce likely-correct software using formal and semi-formal methods [1]

- I integrate multiple disciplines (entrepreneurship, math, computer science, UX/UI design) to create better products [2]

- I use LLMs to generate software based on mathematically correct diagrams [3]

I'm interested in companies with a societal impact: improving lives through education, healthcare, personal growth, financial stability.

[1]: https://www.osequi.com/studies/list/list.html

[2]: https://www.osequi.com/csongor-bartus-profile.pdf

[3]: https://tonsky.me/blog/diagrams/

csbartus · on April 29, 2025

I'm writing a study about how to write likely-correct studies ... :)

This is a second part of a series on likely-correctness, the first is how to create likely-correct software: https://www.osequi.com/studies/list/list.html

csbartus · on April 29, 2025

Here is a bird's-eye view of programming (classic, functional, quantum) vs category theory vs logic, aka the computational trilogy:

https://ncatlab.org/nlab/show/computational+trilogy

It helped me a lot to put my existing programming knowledge into context while learning category theory.