Laboratory · live experiment | Personal finance · agentic-first | 2026, in development
Case Study #02

BR-Budget.
Software I
did not build for you.

An experiment where I test a hypothesis: what happens when the primary user of a finance application stops being a human and becomes an AI agent acting on their behalf. I do not know if I am right. I am building to find out.

Agentic-first
The main user is an agent, not a human
API-first
The agent knows every endpoint
BR-Budget Agent · online
Log a PLN 45 expense at Lidl, food · 19:42 ✓✓
Saved.
In May: PLN 1,273 on food, 87% of the monthly limit. · 19:42
Can I afford a lawn mower for PLN 1,500? · 19:43 ✓✓
BR-Budget, transaction list updated after the agent's action: + Lidl · -PLN 45.00 · food
BR-Budget · agent ↔ UI | Telegram → REST API → production
Try BR-Budget | Read the manifesto | 10 accounts for free · no guarantees
~50
API endpoints
CRUD, agent scope and bootstrap.
2 auth scopes
Human and agent
Clerk session for the UI, bearer for /api/agent/*.
176
Transactions
From my real usage. This is not hello world.
6
Accounts reconciled
ING, Nest Bank, cash, IKE, VAT and current account.
I · Hypothesis

What if the main user
of the application is not a human?

Future software will not be applications for people. It will be a set of data and hard domain rules for an LLM agent standing between the human and the system.
Hypothesis I am testing · 2026

I call this agentic-first. I do not want a human to click around for data if an agent can fetch it, understand it and return with a decision. The UI is still needed, but it stops being the place of daily work.

We have a precedent. Mobile-first also sounded provocative twenty years ago. First you designed for desktop and mobile was an add-on. Mobile-first reversed that logic.

Agentic-first is a similar shift, only deeper inside the product. It is not about bolting an endpoint onto an existing dashboard. It is about designing data, rules, permissions and decision history so an agent can really work with them.

Mobile-first · yesterday
Human → UI (screen) → Product (data + rules)

The human taps, clicks and reads. The app must be beautiful, clear and fast.

Agentic-first · tomorrow?
Human → Agent (LLM + tools) → API (UI in the background)

The human talks to the agent. The agent reads the API. The product must be machine-readable.

The human tells the agent: “check whether I can afford a new lawn mower”. The agent connects to the API, pulls the balance, categorizes expenses, checks obligations and returns an answer. The human never opens the dashboard.
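
The decision step in that flow can be sketched as a pure function the agent calls once it has pulled the numbers from the API. This is a minimal sketch, not BR-Budget's actual logic: the field names, the `canAfford` function and the reserve rule (keep a given number of months of fixed costs untouched) are assumptions for illustration, and the figures are made up.

```typescript
// All amounts are integers in grosze (1 PLN = 100 grosze).
interface AffordabilityInput {
  availableBalance: number;  // sum across accounts
  price: number;             // cost of the planned purchase
  monthlyFixedCosts: number; // obligations such as rent and subscriptions
  reserveMonths: number;     // months of fixed costs to keep untouched
}

interface AffordabilityAnswer {
  affordable: boolean;
  balanceAfter: number;
  monthsOfReserveLeft: number;
}

// Hypothetical rule: a purchase is affordable when the balance left after it
// still covers the requested number of months of fixed costs.
function canAfford(input: AffordabilityInput): AffordabilityAnswer {
  const balanceAfter = input.availableBalance - input.price;
  const monthsOfReserveLeft =
    input.monthlyFixedCosts > 0
      ? Math.floor(balanceAfter / input.monthlyFixedCosts)
      : Infinity;
  return {
    affordable: balanceAfter >= 0 && monthsOfReserveLeft >= input.reserveMonths,
    balanceAfter,
    monthsOfReserveLeft,
  };
}

// Made-up figures: PLN 10,000 balance, PLN 1,500 mower, PLN 2,000 fixed costs.
const answer = canAfford({
  availableBalance: 1_000_000,
  price: 150_000,
  monthlyFixedCosts: 200_000,
  reserveMonths: 4,
});
console.log(answer.affordable, answer.monthsOfReserveLeft);
```

The point of keeping the rule in the application rather than in the prompt is that the agent only supplies the question; the thresholds stay with the data.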

The LLM does not know where I eat out, what subscriptions I have or how I account for VAT. These are data and rules living in the application. The application must be readable for the agent.

It sounds provocative. I may be wrong. That is why I decided to test it in practice.

Finance apps are great at measuring expenses. They do not help you save. Those are two different things.
Second observation · After a decade with YNAB
II · Experiment

API-first, agent inside.

Why a budget app.

I had used YNAB for years. I paid for it. The thought “why pay for something I can build myself” kept growing. Less than a year ago I tried to build Solon, a personal finance startup. Back then the project hit the real cost of development.

Today, in 2026, I took the same project from zero to production in a few days. This is not a self-congratulatory anecdote about speed. It is a premise that changes the calculation for a founder or CTO building internal tools.

2025 · autumn

Solon

The same idea. It broke on time and cost. Buried.

2026

BR-Budget

MVP from zero to production in a few days. Different technology, different way of working.

The architectural decision that changed everything.

“Why am I building advanced filters and reports if the agent will soon generate them on demand?”

I started classically, UI-first. Screens, transactions, categories, reports. In the middle I stopped and shifted priority. API-first. Agentic-first.

  • API with full CRUD for every object
  • Per-user API keys connected to Claude Desktop, ChatGPT and Hermes
  • Inherited permissions; the API key sees exactly what the user sees
  • The UI remains for control, correction and trust, but it does not have to be the daily interface
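
One way to read "the API key sees exactly what the user sees" is that a key resolves to a user id and every query is then scoped by that id, exactly as it would be for a UI session. The sketch below illustrates that idea with in-memory data; the types, the sample keys and the function names are hypothetical, and a real service would store hashed keys in a database rather than plain tokens in an array.

```typescript
interface Account { id: string; ownerId: string; name: string; }
interface ApiKey { token: string; userId: string; }

// In-memory stand-ins for the database (illustrative data only).
const accounts: Account[] = [
  { id: "a1", ownerId: "u1", name: "ING" },
  { id: "a2", ownerId: "u1", name: "cash" },
  { id: "a3", ownerId: "u2", name: "Nest Bank" },
];
const apiKeys: ApiKey[] = [{ token: "key-u1", userId: "u1" }];

// Extracts the token from an "Authorization: Bearer <token>" header value.
function parseBearer(header: string | undefined): string | null {
  const prefix = "Bearer ";
  if (!header || header.slice(0, prefix.length) !== prefix) return null;
  const token = header.slice(prefix.length).trim();
  return token.length > 0 ? token : null;
}

// Resolves a bearer header to a user id, or null when the key is unknown.
function resolveUser(header: string | undefined): string | null {
  const token = parseBearer(header);
  if (token === null) return null;
  for (const key of apiKeys) {
    if (key.token === token) return key.userId;
  }
  return null;
}

// The same visibility rule serves the UI and the agent:
// a caller sees only the accounts owned by the resolved user.
function visibleAccounts(userId: string): Account[] {
  return accounts.filter((account) => account.ownerId === userId);
}

const agentUser = resolveUser("Bearer key-u1");
console.log(agentUser === null ? [] : visibleAccounts(agentUser));
```

Because the agent never gets a permission model of its own, there is nothing extra to keep in sync when the user's access changes.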
BR-Budget Agent · online
Show transactions that look recurring. · 21:08 ✓✓
I found 7 potential subscriptions:
Netflix, Supabase, Lovable, OpenAI and 3 others.

Total: ~PLN 643 / month. · 21:08
I am planning to spend PLN 1,500 on a lawn mower. Can I afford it? · 21:09 ✓✓
Available balance: PLN 124,307.
After the purchase, there is still a 4-month fixed-cost reserve.

Yes. But I am adding it to Pause for 2 days. · 21:09
II · Proof

I really run these queries with my agent through the BR-Budget API.

The agent reads balances, categorizes expenses and suggests decisions without opening the dashboard.

Claude Desktop · ChatGPT Custom GPT · Hermes
Screen 01 · Agent ↔ API · real queries from my usage · Telegram mock

What you get when you log in.

“This is not hello world. It is a YNAB-class app with Polish context and an agentic-first API.”

Zero-based budgeting, an alternative lightweight-control mode, the Pause module, atomic transfers, statement imports from ING and Nest Bank, AI classification with fallback, and refunds attached to the original category.

01 · Dashboard: Balance, income, expenses, limits. PLN 124,307 across 6 accounts. Category structure, budget mode and KPIs in one view.
02 · Categories: Zero-based or lightweight control. Wants, Needs and Savings groups. Share, tracking and category editing from the right panel.
03 · Transactions: 176 entries from real usage. Grouped by day, filters, search, export and transfers as a separate type.
04 · Pause: Cooling-off before a bigger purchase. You enter the item, amount and the date when you can buy it. The system blocks the impulse. This is BR-Budget's main differentiator.

Inside: the decisions that matter.

“Experimental application, but not a hobby app.”

Amounts are stored in grosze. A transfer has a shared transferId and outflow/inflow roles. The bootstrap endpoint loads the snapshot, settings and import coverage in one request. The agent API runs in a separate authorization scope.
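
The first two decisions can be illustrated in a few lines. This is a minimal sketch under assumptions: `toGrosze`, `TransferLeg` and `buildTransfer` are hypothetical names, and only the ideas (integer grosze, a shared transferId, outflow/inflow roles) come from the description above. Parsing from a string avoids floating point entirely, which matters because in binary floating point 0.1 + 0.2 does not equal 0.3.

```typescript
// Converts a decimal string such as "124307.53" to integer grosze
// without going through floating point.
function toGrosze(amount: string): number {
  const match = /^(-?)(\d+)(?:\.(\d{1,2}))?$/.exec(amount);
  if (!match) throw new Error(`Invalid amount: ${amount}`);
  const sign = match[1] === "-" ? -1 : 1;
  // Pad the fraction to two digits: "5" becomes "50", a missing part "00".
  const fraction = ((match[3] || "") + "00").slice(0, 2);
  return sign * (parseInt(match[2], 10) * 100 + parseInt(fraction, 10));
}

interface TransferLeg {
  transferId: string;          // shared by both legs
  role: "outflow" | "inflow";
  accountId: string;
  amount: number;              // signed grosze
}

// Builds both legs together so they can be written in one database
// transaction; the two amounts always sum to zero.
function buildTransfer(
  transferId: string,
  fromAccountId: string,
  toAccountId: string,
  amount: number,
): [TransferLeg, TransferLeg] {
  if (Math.floor(amount) !== amount || amount <= 0) {
    throw new Error("Amount must be a positive integer in grosze");
  }
  return [
    { transferId, role: "outflow", accountId: fromAccountId, amount: -amount },
    { transferId, role: "inflow", accountId: toAccountId, amount },
  ];
}

const legs = buildTransfer("t-1", "ing", "cash", toGrosze("200.00"));
console.log(toGrosze("124307.53"), legs);
```

Producing the two legs from one call is what makes the transfer atomic at the domain level: there is no code path that creates an outflow without its inflow.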

BR-Budget dashboard with technical annotations
Amounts in grosze: “PLN 124,307.53” is stored in the database as 12,430,753. Integer math, no rounding errors.
Endpoint /api/bootstrap: Balance, income, expenses, limits, settings and import coverage in one request.
Import pipeline: CSV + MT940 for ING and Nest Bank. Duplicates detected by transaction hash.
Agent API · keys: Per-user keys, /api/agent/* scope, bearer auth outside the Clerk session.
Zero-based · groups: Every zloty has a category. Wants, Needs and Savings in one domain model.
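
The duplicate detection mentioned in the import pipeline note can be sketched as follows. Which fields go into the hash and which hash function is used are not stated, so both are assumptions here: the sketch fingerprints account, booking date, amount and a normalized description, and uses 32-bit FNV-1a only to stay dependency-free. A production import would more likely use a cryptographic hash.

```typescript
interface StatementRow {
  accountId: string;
  bookingDate: string; // ISO date, e.g. "2026-05-12"
  amount: number;      // signed grosze
  description: string;
}

// 32-bit FNV-1a over UTF-16 code units, returned as a hex string.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 16777619 via shifts, keeping 32 bits.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash.toString(16);
}

// Normalizes whitespace and case so cosmetic differences between two
// exports of the same statement do not produce different hashes.
function fingerprint(row: StatementRow): string {
  const normalized = [
    row.accountId,
    row.bookingDate,
    String(row.amount),
    row.description.trim().replace(/\s+/g, " ").toLowerCase(),
  ].join("|");
  return fnv1a(normalized);
}

// Returns only the rows whose fingerprint is neither already stored
// nor repeated earlier in the same import.
function dropDuplicates(existingHashes: string[], incoming: StatementRow[]): StatementRow[] {
  const seen: Record<string, boolean> = {};
  for (const hash of existingHashes) seen[hash] = true;
  const fresh: StatementRow[] = [];
  for (const row of incoming) {
    const hash = fingerprint(row);
    if (seen[hash]) continue;
    seen[hash] = true;
    fresh.push(row);
  }
  return fresh;
}
```

A known limit of this approach: two genuinely separate purchases with the same date, amount and description collapse into one, so a real pipeline usually adds a bank-provided reference or a per-day sequence number when the statement format offers one.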

Stack chosen for iteration speed

Next.js 16, React 19, TypeScript, Turso · libSQL, Clerk Auth, Recharts, REST · OpenAPI, Playwright E2E

Speed is not the point.

“AI in the development loop changes build speed by an order of magnitude. The point is what you do with that speed.”

The MVP was built faster than would have been possible even a year ago. But BR-Budget is not really about speed. The most interesting question is: what suddenly becomes worth testing when a prototype with a real domain can reach production in a few days?

Gamifying savings is a classic trap. People do not need points to save. They need fewer decisions.
What I cut after the first weeks · Act III
III · What breaks

What the experiment showed.

The first version had points, streaks and rewards for staying on budget. I cut it after the first weeks. It did not work.

I replaced it with the Pause module. Bigger purchase? You enter the item, amount, store link and answer a control question. The system calculates a decision lockout proportional to the amount.
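
A lockout proportional to the amount can be sketched like this. The actual formula is not given, so the rate, the minimum and the maximum below are assumptions; the rate of 32 hours per PLN 1,000 is chosen only so that PLN 1,500 maps to 48 hours, echoing the two-day pause in the chat example.

```typescript
const HOUR_MS = 60 * 60 * 1000;

interface PauseRule {
  hoursPerThousandPln: number; // hypothetical proportionality rate
  minHours: number;            // floor, so small purchases still cool off
  maxHours: number;            // cap, so large purchases are not locked forever
}

// Lockout grows linearly with the amount (in grosze), rounded up to a
// whole hour and clamped to the rule's bounds.
function lockoutHours(amountGrosze: number, rule: PauseRule): number {
  const raw = (amountGrosze / 100_000) * rule.hoursPerThousandPln;
  return Math.min(rule.maxHours, Math.max(rule.minHours, Math.ceil(raw)));
}

// The moment the purchase becomes allowed again.
function unlockAt(createdAt: Date, amountGrosze: number, rule: PauseRule): Date {
  return new Date(createdAt.getTime() + lockoutHours(amountGrosze, rule) * HOUR_MS);
}

const pauseRule: PauseRule = { hoursPerThousandPln: 32, minHours: 24, maxHours: 24 * 14 };
console.log(lockoutHours(150_000, pauseRule)); // PLN 1,500 under this rule
```

Clamping matters as much as the rate: without a floor, small impulse buys would unlock almost immediately, and without a cap, a large planned purchase would be blocked for an unreasonable time.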

A cooling-off period is a real behavioral mechanic, not fake gamification.

Pause · cooling-off module
Locked down · #2 / 2

Large air purifier

1,700.00 PLN
02 days · 23 hrs · 14 min · 07 sec
Control question
“What exactly will change in my life if I buy this?”

I am currently building Pay Yourself First: automatic income detection, suggested saving amount, execution verification and adaptive recommendations. Decision elimination, not motivation through points.

Mobile
from day one.

Because the agent reads through the API, but the human sometimes looks. When they do, they look on a phone.

BR-Budget mobile dashboard · Dashboard
BR-Budget mobile, adding a transaction · Quick transaction
BR-Budget mobile, transaction list · 176 entries
BR-Budget mobile, Pause module · Pause
BR-Budget mobile, accounts · 6 accounts
BR-Budget mobile, categories · Zero-based
Next

The system informs the agent, not the other way around.

Today the agent asks BR-Budget: “what happened?”. The next step is more interesting: BR-Budget tells the agent that something happened.

Income arrived. A subscription renewed. A category crossed its limit. An expense appeared that looks impulsive.

This is the moment when the app stops being a place I check. It starts being a system that speaks up when it has a reason.
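
Since this direction is still ahead, the following is only a sketch of what such signals might look like as data. The signal names, fields and the `limitSignal` rule are all hypothetical; the one design idea it shows is emitting a limit signal on the transition across the limit, so the agent hears about it once instead of on every later expense.

```typescript
// Amounts are integers in grosze.
type Signal =
  | { kind: "income_arrived"; amount: number }
  | { kind: "subscription_renewed"; merchant: string; amount: number }
  | { kind: "limit_crossed"; category: string; spent: number; limit: number };

// Formats grosze as a plain "PLN 1234.56" string.
function pln(grosze: number): string {
  const abs = Math.abs(grosze);
  const cents = abs % 100;
  return `${grosze < 0 ? "-" : ""}PLN ${Math.floor(abs / 100)}.${cents < 10 ? "0" : ""}${cents}`;
}

// Returns a signal only when this expense moves the category from
// "within limit" to "over limit"; otherwise returns null.
function limitSignal(category: string, spentBefore: number, expense: number, limit: number): Signal | null {
  const spentAfter = spentBefore + expense;
  if (spentBefore <= limit && spentAfter > limit) {
    return { kind: "limit_crossed", category, spent: spentAfter, limit };
  }
  return null;
}

// Turns a signal into the short message the agent would receive.
function describe(signal: Signal): string {
  switch (signal.kind) {
    case "income_arrived":
      return `Income arrived: ${pln(signal.amount)}`;
    case "subscription_renewed":
      return `Subscription renewed: ${signal.merchant}, ${pln(signal.amount)}`;
    case "limit_crossed":
      return `Limit crossed in ${signal.category}: ${pln(signal.spent)} of ${pln(signal.limit)}`;
  }
}

const crossed = limitSignal("food", 140_000, 10_000, 146_000);
if (crossed !== null) console.log(describe(crossed));
```

A discriminated union keeps the set of reasons to speak up explicit: adding a new signal means adding a case, and the compiler flags every place that has to handle it.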

BR-Budget (data · rules) ↔ Agent (LLM · tools)
TODAY · pull: “what were my expenses in May?”
TOMORROW · push: “income just arrived, time to save”
→ 01
Behavioral

Pay Yourself First

  • Automatic income detection
  • Suggested amount to save
  • Execution verification
→ 02
Agent communication

Push, system to agent

  • Income and expense signals
  • Recurring subscription renewals
  • Category or limit overrun
→ 03
Context

A fuller picture of finances

  • Assets, earnings, liabilities and goals
  • Warranty tracker from receipts
  • Statement import from more banks

This is not a client case study.
It is a manifesto.

I am building BR-Budget because I want to test whether software can be designed differently: less as a place to click, more as a system of data, rules and decisions that an agent can safely work with.

I may be wrong. I am building to find out.

Are you building a business where
software stops being a product for humans?