What If We Build Agents Like a dbt Project
Moving from mutable chaos to declarative peace in multi-agent systems
I came to agent development from data engineering, and once again, while putting together a typical LangGraph setup, I caught myself missing the declarative approach many know well from dbt — where you describe what you want done with the data, not how. And then a thought hit me: why not build my own agent framework that offers exactly the same approach?
Plus, my least favorite thing about multi-agent systems on LangGraph is the mutable State. It quickly turns into an uncontrollable dumpster fire in RAM; you have to carefully update it every single time, and when something doesn’t work the way you want — you’re left printing things out and hunting for bugs. Maybe I’m just doing something wrong, but for me, these things always mean wasting extra mental energy.
Looking for a solution to this problem, I discovered the application of event-driven architecture in multi-agent systems, and later — event-sourced architecture. My main source of knowledge for this was a recent paper “ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering” (Brito dos Santos Filho, 2026). Ultimately, I wrote my own framework — zymi.
But first things first.
The framework is written in Rust — I don’t think there’s any need to repeat its perks. Let me put it out there right away: the development was done using Claude Code. You can feel however you want about vibecoding and code generation, but during development, I realized one very important thing: if you want to build something relevant today, you just don’t have any other options. I started working on the project right at the peak of the OpenClaw hype, and right now I’m writing this article while simultaneously giving Claude Code commands from my iPhone. If I had to write it all myself, I’d probably finish it when free AGI arrives (so, never).
About naming
All this speed, by the way, is how the project got its name. Initially it was zumi, a nod to dog zoomies — when a dog suddenly starts tearing around the apartment non-stop. But about two weeks in, an app with an empathetic AI pet launched under the exact same name, so I changed one letter to avoid confusion.
Declarative approach
The whole beauty of dbt is that the engine completely takes over the question of how to do what you want. You declare your intent — draft a YAML file, write some SQL — and dbt runs all the necessary transformations itself. I decided to go down exactly the same path.
Installing zymi:
```shell
pip install zymi-core
```
And initializing the built-in research agent project example:
```shell
zymi init --example research
```
The default project looks like this:
```
zymi-research/
├── .zymi/
├── agents/
│   ├── researcher.yml
│   └── writer.yml
├── memory/
├── output/
├── pipelines/
│   └── research.yml
├── tools/
└── project.yml
```
This pipeline consists of just two agents — a researcher and a writer. Here’s how they are configured:
agents/researcher.yml:

```yaml
name: researcher
description: "Research agent — searches the web, scrapes pages, and stores findings in memory"
model: ${default_model}
system_prompt: |
  You are a thorough research assistant. Your job is to find accurate,
  up-to-date information on the given topic.

  Strategy:
  1. Start with broad web searches to identify key sources.
  2. Scrape the most promising pages for detailed content.
  3. Store important findings in memory with clear keys.
  4. Cite your sources.

  Always prefer primary sources over secondary ones.
  If information conflicts, note the discrepancy.
tools:
  - web_search
  - web_scrape
  - write_memory
max_iterations: 15
```

agents/writer.yml:

```yaml
name: writer
description: "Writer agent — reads research findings and produces a structured report"
model: ${default_model}
system_prompt: |
  You are a skilled technical writer. Your job is to transform raw research
  findings into a clear, well-structured report.

  Guidelines:
  1. Read all available memory entries to understand the research.
  2. Organize findings into logical sections.
  3. Include a summary at the top.
  4. Cite sources where available.
  5. Write the final report to the output directory.

  Format: Markdown. Be concise but thorough.
tools:
  - read_file
  - write_file
max_iterations: 10
```
Here is how a web search tool is configured, using Tavily as an example:

```yaml
name: web_search
description: "Search the web for information on a given query"
parameters:
  type: object
  properties:
    query:
      type: string
      description: "Search query"
  required: [query]
implementation:
  kind: http
  method: POST
  url: "https://api.tavily.com/search"
  headers:
    Content-Type: "application/json"
    Authorization: "Bearer ${env.TAVILY_API_KEY}"
  body_template: '{"query": "${args.query}", "max_results": 5}'
```
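The `${args.*}` and `${env.*}` placeholders are filled in at call time. As a rough illustration of what that substitution amounts to (a hypothetical helper — zymi's real interpolation rules may differ, and a production version would also JSON-escape the values):

```python
import os
import re

def render_template(template: str, args: dict) -> str:
    """Substitute ${args.*} and ${env.*} placeholders in a tool template.

    Illustrative sketch only: real interpolation should also escape the
    substituted values for the target format (here, JSON).
    """
    def repl(match: re.Match) -> str:
        scope, key = match.group(1), match.group(2)
        if scope == "args":
            return str(args[key])
        if scope == "env":
            return os.environ[key]
        raise ValueError(f"unknown scope: {scope}")

    return re.sub(r"\$\{(args|env)\.(\w+)\}", repl, template)

body = render_template(
    '{"query": "${args.query}", "max_results": 5}',
    {"query": "event sourcing"},
)
# body == '{"query": "event sourcing", "max_results": 5}'
```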
After that, you can assemble the agents into a single pipeline:
```yaml
name: research
description: "Multi-step research pipeline: parallel search → analysis → report"

steps:
  - id: search_web
    agent: researcher
    task: "Search the web for information about: ${inputs.topic}"

  - id: search_deep
    agent: researcher
    task: "Find in-depth articles and technical details about: ${inputs.topic}"

  - id: analyze
    agent: researcher
    task: >
      Analyze and cross-reference all findings from the web search and deep
      search. Identify key themes, contradictions, and gaps. Store a structured
      summary in memory under the key 'analysis'.
    depends_on:
      - search_web
      - search_deep

  - id: write_report
    agent: writer
    task: >
      Read the analysis from memory and write a comprehensive research report
      to ./output/report.md. Include an executive summary, main findings,
      and a sources section.
    depends_on:
      - analyze

input:
  type: text

output:
  step: write_report
```
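The engine derives the execution plan from `depends_on` alone: steps whose dependencies are all satisfied run in parallel as one "level". The leveling logic can be sketched roughly like this (an illustration, not zymi's actual scheduler):

```python
def execution_levels(steps: dict[str, list[str]]) -> list[list[str]]:
    """Group pipeline steps into parallel execution levels.

    `steps` maps a step id to the list of step ids it depends on.
    Steps in the same level have no unmet dependencies, so they
    can run concurrently.
    """
    remaining = dict(steps)
    done: set[str] = set()
    levels: list[list[str]] = []
    while remaining:
        # Everything whose dependencies are already done is ready now.
        ready = sorted(s for s, deps in remaining.items()
                       if all(d in done for d in deps))
        if not ready:
            raise ValueError("cycle detected in depends_on graph")
        levels.append(ready)
        done.update(ready)
        for s in ready:
            del remaining[s]
    return levels

pipeline = {
    "search_web": [],
    "search_deep": [],
    "analyze": ["search_web", "search_deep"],
    "write_report": ["analyze"],
}
print(execution_levels(pipeline))
# [['search_deep', 'search_web'], ['analyze'], ['write_report']]
```

This matches the "4 steps, 3 levels" plan the CLI prints below: both searches run in parallel, then the analysis, then the report.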
And that’s basically it — a simple multi-agent system is ready to go. We can run it right away from the console:
```
❯ zymi run research -i topic="Event sourcing in AI"

Pipeline: research
Multi-step research pipeline: parallel search → analysis → report
Execution plan: 4 steps, 3 levels

Level 1 (parallel): search_deep, search_web
  [search_deep] done (2 iterations)
  [search_web] done (3 iterations)
Level 2: analyze
  [analyze] done (2 iterations)
Level 3: write_report

--- approval required ----------------------------
File write outside allowed directories: output/report.md
approve? [y/N]: y

  [write_report] done (2 iterations)
---
Pipeline completed successfully.

Final output:
The comprehensive research report on Event Sourced Architecture in Multi-Agent Systems has been successfully written to `./output/report.md`. It includes an executive summary, detailed findings on key themes, potential contradictions, and areas for further exploration, as well as a section on sources.
```
Besides the convenience and speed — which is subjective, of course — this gives a much more important advantage for the modern world: generative models are great at generating code, but generating YAML, especially YAML that strictly follows a JSON Schema, is an order of magnitude easier for them. I'm planning an experiment that I'll describe in my next article: I'll ask Claude/Codex to generate the exact same pipeline on LangGraph and on zymi, and run both. I bet zymi will take fewer iterations and tokens.
But the declarative config is just the external interface. Under the hood, zymi also differs from LangGraph — instead of a shared state that is mutated by agents, all communication is built on a single data bus.
Event-sourced architecture
In zymi, every event is an immutable record in the DB with hash-chain verification. All interactions are built on these events — records in the data bus. All connectors, agents, and tools write to and read from this bus; it is the single source of truth in the system. As a result, we get not just a log, but a cryptographically linked chain of all the agent’s actions. This makes every run traceable and reproducible right out of the box.
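A minimal sketch of what such a hash-chained log looks like in principle (illustrative only — zymi's actual event schema and storage are different):

```python
import hashlib
import json
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: events are immutable once recorded
class Event:
    seq: int
    kind: str
    payload: dict
    prev_hash: str
    hash: str

GENESIS = "0" * 64  # sentinel "previous hash" for the first event

def append_event(log: list[Event], kind: str, payload: dict) -> Event:
    """Append an event whose hash covers its content AND the previous hash."""
    prev_hash = log[-1].hash if log else GENESIS
    body = json.dumps({"seq": len(log) + 1, "kind": kind,
                       "payload": payload, "prev": prev_hash},
                      sort_keys=True)
    event = Event(len(log) + 1, kind, payload, prev_hash,
                  hashlib.sha256(body.encode()).hexdigest())
    log.append(event)
    return event

def verify_chain(log: list[Event]) -> bool:
    """Recompute every hash; any tampered or reordered event breaks the chain."""
    prev = GENESIS
    for e in log:
        body = json.dumps({"seq": e.seq, "kind": e.kind,
                           "payload": e.payload, "prev": prev},
                          sort_keys=True)
        if e.prev_hash != prev or e.hash != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = e.hash
    return True

log: list[Event] = []
append_event(log, "workflow_started", {"pipeline": "research"})
append_event(log, "intention_emitted", {"tool": "web_search"})
assert verify_chain(log)
```

Because each hash includes the previous one, editing or deleting any past event invalidates every hash after it, which is exactly what makes the run log trustworthy as a single source of truth.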
Why do we need cryptography here? We can draw an analogy with dbt again: the event log gives us lineage — dbt docs; intentions and verification give us confidence in the correctness of this log — dbt test.
Thanks to this, we always know not only what an agent did, but why: instead of mutating something directly, an agent expresses an intention to do so. That intention is also written to the data bus, where the monitor picks it up and, based on its policy, decides whether the action can be executed or whether the user needs to be asked. At the very end of the run below, in events 49-50, you can see how the monitor hesitated about allowing a write to a directory outside the allowed ones and called for a human.
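The evaluation step can be pictured as a pure function from an intention to a verdict. Here is a toy version of such a policy (hypothetical — the allowed-directory list and the rule itself are invented for illustration, not zymi's actual configuration):

```python
from pathlib import Path

# Hypothetical policy: writes are auto-approved only inside these directories.
ALLOWED_DIRS = [Path("memory")]

def evaluate_intention(kind: str, data: dict) -> str:
    """Return 'approved' or 'requires_approval' for an emitted intention."""
    if kind == "write_file":
        target = Path(data["path"])
        # Any write outside the allowed directories escalates to a human.
        if not any(target.is_relative_to(d) for d in ALLOWED_DIRS):
            return "requires_approval"
    return "approved"

print(evaluate_intention("call_custom_tool", {"tool_name": "web_search"}))
# approved
print(evaluate_intention("write_file", {"path": "output/report.md"}))
# requires_approval
```

The second verdict mirrors event #49 in the log: the file write targets a path outside the allowed directories, so the monitor pauses the pipeline and requests approval.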
Viewing the run log:

```shell
zymi events --verbose
```

Which spits out this massive log:
```
Stream 'pipeline-research-04e6e68e-622c-4233-8a2d-2051fd132b11': 50 event(s)
#1 10:50:51.317 workflow_started source=engine
    4 node(s) — pipeline: research
#2 10:50:51.320 workflow_node_started source=engine
    search_deep — agent=researcher, task="Find in-depth articles and technical details about: Even
#3 10:50:51.320 workflow_node_started source=engine
    search_web — agent=researcher, task="Search the web for information about: Event sourcing in
#4 10:50:52.806 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_search","arguments":"{\"query\":\"Event sourcing in AI\"}"}}
#5 10:50:52.806 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#6 10:50:52.981 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_search","arguments":"{\"query\":\"Event sourcing in AI\"}"}}
#7 10:50:52.982 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#8 10:50:56.138 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_scrape","arguments":"{\"url\": \"https://www.eventsourcing.ai/\"}"}}
#9 10:50:56.139 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#10 10:50:57.055 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_scrape","arguments":"{\"url\": \"https://www.eventsourcing.ai/\"}"}}
#11 10:50:57.055 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#12 10:50:58.860 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_scrape","arguments":"{\"url\": \"https://infodation.com/en/blogs/event
#13 10:50:58.861 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#14 10:50:59.261 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_scrape","arguments":"{\"url\": \"https://infodation.com/en/blogs/event
#15 10:50:59.262 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#16 10:51:03.028 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_scrape","arguments":"{\"url\": \"https://artium.ai/insights/event-sour
#17 10:51:03.031 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#18 10:51:03.105 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_scrape","arguments":"{\"url\": \"https://artium.ai/insights/event-sour
#19 10:51:03.106 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#20 10:51:05.744 intention_emitted source=orchestrator
    call_custom_tool data={"type":"CallCustomTool","data":{"tool_name":"web_scrape","arguments":"{\"url\": \"https://akka.io/blog/event-sourcing-t
#21 10:51:05.745 intention_evaluated source=orchestrator
    call_custom_tool -> approved
#22 10:51:16.668 intention_emitted source=orchestrator
    write_memory data={"type":"WriteMemory","data":{"key":"event_sourcing_ai_overview","content":"Event sourcing in AI involves using event-dr
#23 10:51:16.669 intention_evaluated source=orchestrator
    write_memory -> approved
#24 10:51:16.669 memory_written source=agent
    event_sourcing_ai_overview = Event sourcing in AI involves using event-driven architecture to continuously ca
#25 10:51:16.671 intention_emitted source=orchestrator
    write_memory data={"type":"WriteMemory","data":{"key":"event_sourcing_ai_benefits_challenges","content":"The integration of AI into event
#26 10:51:16.672 intention_evaluated source=orchestrator
    write_memory -> approved
#27 10:51:16.674 memory_written source=agent
    event_sourcing_ai_benefits_challenges = The integration of AI into event sourcing enhances decision-making, predictive a
#28 10:51:16.675 intention_emitted source=orchestrator
    write_memory data={"type":"WriteMemory","data":{"key":"event_sourcing_key_components","content":"Event sourcing is a data storage pattern
#29 10:51:16.676 intention_evaluated source=orchestrator
    write_memory -> approved
#30 10:51:16.676 memory_written source=agent
    event_sourcing_key_components = Event sourcing is a data storage pattern where every change is recorded as an ev
#31 10:51:16.678 intention_emitted source=orchestrator
    write_memory data={"type":"WriteMemory","data":{"key":"event_sourcing_vs_cqrs","content":"Event sourcing is often confused with Command Qu
#32 10:51:16.679 intention_evaluated source=orchestrator
    write_memory -> approved
#33 10:51:16.679 memory_written source=agent
    event_sourcing_vs_cqrs = Event sourcing is often confused with Command Query Responsibility Segregation (
#34 10:51:16.684 intention_emitted source=orchestrator
    write_memory data={"type":"WriteMemory","data":{"key":"event_sourcing_agentic_ai","content":"Event sourcing provides a strong foundation f
#35 10:51:16.686 intention_evaluated source=orchestrator
    write_memory -> approved
#36 10:51:16.688 memory_written source=agent
    event_sourcing_agentic_ai = Event sourcing provides a strong foundation for agentic AI systems by supporting
#37 10:51:16.690 intention_emitted source=orchestrator
    write_memory data={"type":"WriteMemory","data":{"key":"event_sourcing_ai_sources","content":"Sources for event sourcing in AI include Even
#38 10:51:16.690 intention_evaluated source=orchestrator
    write_memory -> approved
#39 10:51:16.691 memory_written source=agent
    event_sourcing_ai_sources = Sources for event sourcing in AI include EventSourcing.ai, Infodation, Artium.ai
#40 10:51:21.170 workflow_node_completed source=engine
    ok search_web
#41 10:51:24.346 workflow_node_completed source=engine
    ok search_deep
#42 10:51:24.346 workflow_node_started source=engine
    analyze — agent=researcher, task="Analyze and cross-reference all findings from the web se
#43 10:51:29.794 intention_emitted source=orchestrator
    write_memory data={"type":"WriteMemory","data":{"key":"analysis","content":"### Key Themes\n1. **Traceability and Reproducibility**: Event
#44 10:51:29.794 intention_evaluated source=orchestrator
    write_memory -> approved
#45 10:51:29.795 memory_written source=agent
    analysis = ### Key Themes
    1. **Traceability and Reproducibility**: Event sourcing allows fo
#46 10:51:30.901 workflow_node_completed source=engine
    ok analyze
#47 10:51:30.903 workflow_node_started source=engine
    write_report — agent=writer, task="Read the analysis from memory and write a comprehensive rese
#48 10:51:37.515 intention_emitted source=orchestrator
    write_file data={"type":"WriteFile","data":{"path":"output/report.md","content":"# Research Report on Event Sourcing in AI\n\n## Executi
#49 10:51:37.516 intention_evaluated source=orchestrator
    write_file -> requires_approval: File write outside allowed directories: output/report.md
#50 10:51:37.517 approval_requested source=orchestrator
    ⏳ waiting id=9869501e-141a-48bb-84a8-37f1a271b42f File write outside allowed directories: output/report.md
```

Conclusion
Of course, zymi is still raw — it's an early alpha, but it's already usable for prototyping and experimenting with agents. The project has a big backlog:
- Migration to libsql — vector memory, native async, and edge replicas
- Polishing the ability to connect PostgreSQL as a data bus
- Implementing the ability to add Python tools in a declarative mode
- LLM response streaming
And a lot more.
I'm really curious to see whether this data engineering approach will resonate in the modern world of AI agents. I'd be glad to hear any feedback — drop by the project repository (and drop a star, of course).
And may immutability be with you!
This article was originally published in Russian on Habr