Yewe
June 15, 2026·18 min read

Building a backend we can't read

Tags: zero-knowledge, fintech, totals, cryptography, architecture

This post walks through how we built a backend that is structurally unable to read its users' data, with the guarantee coming from the architecture rather than from a "trust me bro" promise.

We just shipped shared expenses in Totals. Split a dinner, track who owes who, settle up. The usual stuff. The unusual part is that our server holds zero plaintext: amounts, names, group contents, all encrypted on the device before they ever reach us. If our database leaked tomorrow, the leak would contain nothing readable.

What follows is what it took, decision by decision.


The constraint

Totals is a personal finance app. It reads your bank SMS notifications on your phone, builds a unified ledger, lets you budget, tracks loans, and so on. That part has always run entirely on-device. No server. The architecture is honest: the data is on your phone because there's nowhere else for it to be.

In version 1.5 we shipped shared expenses. Split a dinner with friends, track who owes who, settle up. The classic Splitwise use case. To do this you need a server, because the whole point is that multiple devices have to agree on a shared ledger. Coordination requires a middle.

The interesting question is what that middle is allowed to know.

Splitwise's middle knows everything. Amounts, names, what the dinner was for, who paid, who still owes. That's their architecture; it's not a secret. You sign up with an email, they store your splits, they're the system of record. If their database leaks, the people in your groups and the contents of your splits are public. If a government subpoenas them, they hand over whatever they have, which is everything. That's the normal model.

The constraint I wanted was different. The server should be unable to know any of those things. Not "trust me bro". If the database leaks, the leak contains nothing useful. If we get subpoenaed, the honest answer is that we don't have the requested data, not as a policy choice, but as a property of how the system is built.

This is the difference between trust-me-bro architecture and trust by architecture. The first can be quietly violated. The second either preserves a property or it doesn't. The difference is auditable.

Choosing architecture over "trust me bro" is the kind of decision that sounds nice and gets harder every step downstream. Every feature you add becomes a test: can we do this without breaking the property? Most of the obvious implementations break it. Most of the obvious shortcuts break it. You spend a lot of time saying "no, that doesn't work, the server would see X" and a lot of time finding the harder answer.

What follows is a walk through the actual pieces, with the obvious wrong answers and the right answers next to each other.

Server sees
  • device public keys
  • group IDs (random UUIDs)
  • encrypted payloads (opaque)
  • payload size + timestamp
  • push notification tokens
  • aggregate stats
Server never sees
  • emails, phones, names
  • group names
  • member display names
  • expense amounts + reasons
  • decrypted payload contents
  • recovery PINs
  • identity vault contents
The server holds metadata. The contents are unreadable to it.

The next few sections walk through how each row on the "never sees" side is enforced. We'll come back to demos that make this concrete once the concepts they use have a name.


Identity without accounts

The first thing a normal backend wants is an account system. Email, password, maybe a phone number, an ID it can use to reference you across requests. You can't do anything without identifying who you are.

We don't have any of that. There's no signup. There's no email collection. There's no phone verification. There's no admin panel where I could look you up if I wanted to, because there's nothing to look up by.

What we have instead is per-device cryptographic identity. When you install the app for the first time, your phone generates an Ed25519 keypair locally. The private key never leaves the device. The public key (32 bytes of essentially random-looking data) is the device's identifier. Server-side, that's the only handle we have on "you."

Authentication to our API starts with a challenge-signature dance. The client asks the server for a 32-byte random challenge, signs it with its private key, and submits the request with the public key, challenge, and signature in the headers. The server verifies the signature, marks the challenge as used (single-use, claimed atomically in a single SQL update; anything less invites a replay race), and the request is authenticated.
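A minimal sketch of that handshake, to make the single-use property concrete. It is not the engine's code: it uses Node's built-in `node:crypto` for Ed25519 where the engine uses @noble/curves, an in-memory `Map` stands in for the database table, and the names `issueChallenge`, `claimChallenge`, and `authenticate` are made up for illustration.

```typescript
import { generateKeyPairSync, randomBytes, sign, verify, KeyObject } from "node:crypto";

// In-memory stand-in for the server's challenge table (an assumption of this
// sketch; a real server would claim the row with one atomic SQL UPDATE).
const CHALLENGE_TTL_MS = 60_000;
const challenges = new Map<string, { expiresAt: number; used: boolean }>();

// Server: hand out a 32-byte random challenge and remember when it expires.
function issueChallenge(now: number = Date.now()): Buffer {
  const challenge = randomBytes(32);
  challenges.set(challenge.toString("hex"), { expiresAt: now + CHALLENGE_TTL_MS, used: false });
  return challenge;
}

// Server: claim a challenge exactly once. JavaScript runs this check-and-set
// without interleaving; in SQL the same guarantee needs a single UPDATE whose
// WHERE clause requires "unused and unexpired".
function claimChallenge(challenge: Buffer, now: number = Date.now()): boolean {
  const entry = challenges.get(challenge.toString("hex"));
  if (!entry || entry.used || entry.expiresAt < now) return false;
  entry.used = true;
  return true;
}

// Server: verify the Ed25519 signature first, then claim the challenge.
function authenticate(
  publicKey: KeyObject,
  challenge: Buffer,
  signature: Buffer,
  now: number = Date.now(),
): boolean {
  if (!verify(null, challenge, publicKey, signature)) return false;
  return claimChallenge(challenge, now);
}

// Client: generate a device keypair, fetch a challenge, sign it.
const device = generateKeyPairSync("ed25519");
const challenge = issueChallenge();
const signature = sign(null, challenge, device.privateKey);

const firstAttempt = authenticate(device.publicKey, challenge, signature);
const replayAttempt = authenticate(device.publicKey, challenge, signature);
console.log({ firstAttempt, replayAttempt });
```

The second call presents the same valid signature and is still rejected, because the challenge has already been claimed.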

The challenge expires in 60 seconds. The signature is verified with @noble/curves (audited, pure TypeScript); the mobile app does the matching Ed25519 signing on the Dart side. No session is stored. A 10-minute HMAC-signed bearer token is issued after a successful handshake so the device doesn't have to re-sign every request on a hot path, but it's a UX optimization sitting on top of the real cryptographic root.
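For illustration, here is one way a short-lived HMAC-signed token can work without any stored session. The token layout (`pubkey.expiry.mac`), the `serverSecret` variable, and the function names are assumptions of this sketch, not the engine's actual format.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

const TOKEN_TTL_MS = 10 * 60 * 1000; // 10 minutes
const serverSecret = randomBytes(32); // hypothetical server-side HMAC key

function mac(body: string): Buffer {
  return createHmac("sha256", serverSecret).update(body).digest();
}

// Assumed token layout: "<device pubkey hex>.<expiry in ms>.<hmac hex>".
function issueToken(devicePubkeyHex: string, now: number = Date.now()): string {
  const body = `${devicePubkeyHex}.${now + TOKEN_TTL_MS}`;
  return `${body}.${mac(body).toString("hex")}`;
}

// Returns the device public key if the token is authentic and unexpired,
// otherwise null. Nothing is looked up: the MAC is the whole proof.
function verifyToken(token: string, now: number = Date.now()): string | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [pubkeyHex, expiry, tagHex] = parts as [string, string, string];
  const expected = mac(`${pubkeyHex}.${expiry}`);
  const given = Buffer.from(tagHex, "hex");
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  if (Number(expiry) <= now) return null;
  return pubkeyHex;
}

const examplePubkey = "ab".repeat(32); // stand-in for a 32-byte public key
const token = issueToken(examplePubkey, 1_000);
console.log(verifyToken(token, 2_000) === examplePubkey);
```

Because the expiry is covered by the MAC, a client cannot extend its own token, and the server keeps no per-device state between requests.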

The thing this architecture gives you, by construction, is that we can't enumerate users. We don't have a users table. We have devices that appear when they sign their first challenge, and disappear when they stop showing up. If you ask "how many users does Totals have," I genuinely cannot tell you. I can tell you how many device public keys we have, but I have no way of knowing whether one human owns five of them or five humans own one each. Probably some of both. We don't know.

This was the first surprise. You realize fairly quickly that not knowing requires constant discipline. Every time someone proposes a feature, the natural design starts with "we'll just store a flag against the user record." There's no user record. The flag has to live somewhere else, or it has to encrypt itself, or it has to not exist.


Group membership the server can't read

A group is a place where multiple devices share state. Server-side, it's a UUID — random, opaque, meaningless without context. Devices that are members of a group have a row in group_members pairing their public key with the group ID. That's the entire server-side picture: a UUID, a list of public keys.

The group's name? Not on the server. The members' display names? Not on the server. The actual expenses? Not on the server in any readable form.

How does this work? Each group has a 256-bit random AES key generated by the device that creates the group. That key is the symmetric secret for everything in the group. It never leaves a member device in readable form: it travels only when wrapped for a specific member (onboarding or rotation), and inside the PIN-sealed recovery vault described later.

The hard problem is getting the key to a new member who joins. The server can relay encrypted payloads between devices, but it can't relay a key in the clear, because that would defeat the entire purpose. The solution is X25519 ECDH, using X25519 keys derived from each device's Ed25519 identity keypair. When a new device joins a group:

  1. The server announces the new member's public key to existing members (server knows pubkeys; this is fine).
  2. An existing member computes a shared secret via X25519(own private key, new member's public key). The new member, independently, can compute the same shared secret via X25519(own private key, existing member's public key). Diffie-Hellman magic: same secret, derived in two places, never transmitted.
  3. The existing member encrypts the group key with AES-256-GCM using the shared secret, sends it as a payload through our server.
  4. The new member decrypts, gets the group key, and is now able to read everything in the group.

The server in step 3 sees an opaque blob it cannot decrypt. The shared secret is never on the wire. We just route the ciphertext.
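The four steps above can be sketched with Node's built-in crypto. Two caveats, both assumptions of this sketch: it generates X25519 keypairs directly rather than deriving them from Ed25519 identity keys, and it runs the shared secret through HKDF (with a made-up `group-key-wrap` label) before using it as an AES key, which the description above does not specify.

```typescript
import {
  createCipheriv,
  createDecipheriv,
  diffieHellman,
  generateKeyPairSync,
  hkdfSync,
  randomBytes,
  KeyObject,
} from "node:crypto";

interface WrappedKey {
  iv: Buffer;
  ciphertext: Buffer;
  tag: Buffer;
}

// Both sides call this with their own private key and the other's public key
// and arrive at the same 32-byte key. The HKDF step is this sketch's choice.
function deriveWrapKey(myPrivate: KeyObject, theirPublic: KeyObject): Buffer {
  const shared = diffieHellman({ privateKey: myPrivate, publicKey: theirPublic });
  return Buffer.from(hkdfSync("sha256", shared, Buffer.alloc(0), "group-key-wrap", 32));
}

// Existing member: encrypt the group key under the derived key (AES-256-GCM).
function wrapGroupKey(wrapKey: Buffer, groupKey: Buffer): WrappedKey {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", wrapKey, iv);
  const ciphertext = Buffer.concat([cipher.update(groupKey), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

// New member: decrypt. final() throws if the key is wrong or the blob was altered.
function unwrapGroupKey(wrapKey: Buffer, wrapped: WrappedKey): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", wrapKey, wrapped.iv);
  decipher.setAuthTag(wrapped.tag);
  return Buffer.concat([decipher.update(wrapped.ciphertext), decipher.final()]);
}

const alice = generateKeyPairSync("x25519"); // existing member
const bob = generateKeyPairSync("x25519"); // new member
const groupKey = randomBytes(32);

// This object is all the relay ever handles.
const relayed = wrapGroupKey(deriveWrapKey(alice.privateKey, bob.publicKey), groupKey);
const recovered = unwrapGroupKey(deriveWrapKey(bob.privateKey, alice.publicKey), relayed);
console.log(recovered.equals(groupKey));
```

Anyone holding only the two public keys and the relayed blob, which is the server's position, has no private key to feed into `deriveWrapKey`.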

[Diagram: the existing member computes X25519(my_priv, new_pub) → shared secret S. The new member computes X25519(my_priv, existing_pub) → the same S, which is never transmitted. The existing member sends AES-GCM(S, group_key), relayed by the server (opaque to it). The new member decrypts with S and now has group_key. The server saw one encrypted blob, nothing else.]
Both members derive the same shared secret without it ever being sent.

[Interactive demo: Alice and Bob each generate a keypair, then compute the shared secret. Real X25519, running in your browser, with the same library the engine uses.]

This is the kind of thing where the abstract description sounds elegant and the implementation has interesting failure modes. What if an existing member is offline when the new one joins? Their phone has to come back online before the key exchange can complete. What if a member leaves and you want them to no longer be able to read future content? You rotate the group key — generate a new one, redistribute it to remaining members the same way, and announce internally that the old one is dead. Former members can read history (they had the old key) but nothing new.

The server has no view into any of this. From its perspective, members come and go, encrypted payloads flow, life is opaque.

Now that the moving parts have names, the bootstrap demo. Alice creates a group, Bob joins, Alice approves him with a key_exchange payload, Bob derives the same shared secret and unwraps the group key. Watch what the server's column holds at each stage.

[Interactive demo: Alice and Bob want to share expenses. Three columns (Alice, the group creator; our server gateway; Bob, the recipient) are stepped through four stages: Alice creates group → Bob joins via invite → Alice sends key_exchange → Bob derives + decrypts. Real X25519 + AES-256-GCM. The group key never crosses the server in any form it could read.]

And here's what every subsequent expense looks like flowing through that group. Edit the JSON, walk it through, watch the middle column stay opaque.

Alice composes the expense (plaintext, on her device):
{
  "type": "expense",
  "amount": 1500,
  "currency": "ETB",
  "reason": "Dinner at Lucy",
  "paidBy": "alice",
  "splitAmong": ["alice", "bob"]
}
[Interactive demo: three columns (Alice's sender device, our server gateway, Bob's recipient device) are stepped through four stages: Alice encrypts → send to server → deliver to Bob → Bob decrypts. It uses the same AES-256-GCM the engine uses, and the middle column stays opaque throughout.]
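A sketch of what the encrypt and decrypt steps could look like for the expense above, using AES-256-GCM with the group key. The wire layout (IV, then ciphertext, then tag) and the function names are assumptions for illustration; the engine's actual framing may differ.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

interface Expense {
  type: "expense";
  amount: number;
  currency: string;
  reason: string;
  paidBy: string;
  splitAmong: string[];
}

// Assumed wire layout: 12-byte IV || ciphertext || 16-byte GCM tag.
function encryptPayload(groupKey: Buffer, expense: Expense): Buffer {
  const iv = randomBytes(12); // fresh IV for every payload
  const cipher = createCipheriv("aes-256-gcm", groupKey, iv);
  const body = Buffer.concat([cipher.update(JSON.stringify(expense), "utf8"), cipher.final()]);
  return Buffer.concat([iv, body, cipher.getAuthTag()]);
}

// Throws if the key is wrong or any byte of the blob was modified.
function decryptPayload(groupKey: Buffer, blob: Buffer): Expense {
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(blob.length - 16);
  const body = blob.subarray(12, blob.length - 16);
  const decipher = createDecipheriv("aes-256-gcm", groupKey, iv);
  decipher.setAuthTag(tag);
  const plaintext = Buffer.concat([decipher.update(body), decipher.final()]);
  return JSON.parse(plaintext.toString("utf8")) as Expense;
}

const groupKey = randomBytes(32); // held only by member devices
const expense: Expense = {
  type: "expense",
  amount: 1500,
  currency: "ETB",
  reason: "Dinner at Lucy",
  paidBy: "alice",
  splitAmong: ["alice", "bob"],
};

const onTheWire = encryptPayload(groupKey, expense); // what the server stores
const delivered = decryptPayload(groupKey, onTheWire); // what Bob sees
console.log(onTheWire.length, delivered.reason);
```

The server can still observe `onTheWire.length` and when it arrived, which matches the "payload size + timestamp" row in the "server sees" list.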

Pushing notifications without telling Google what they're for

Mobile apps don't get to choose how they wake up. To deliver a notification when something happens, you have to go through Firebase Cloud Messaging (Google) or APNs (Apple). Their servers see your traffic. If you put the notification body in the message, they see the body.

The naive implementation of "Alice just added a $40 dinner expense to your group" would send Google a notification that says exactly that, plus the group ID. Now Google has another data point about Alice and her friends.

What we send instead is a doorbell. A content-free, data-only FCM message that carries the group ID and the new payload's ID. That's it. No notification title, no body, nothing for the OS to auto-render. When the message arrives, the app wakes up in the background, pulls the actual encrypted payload from our server, decrypts it locally, and only then composes the user-visible notification. The lock screen text — "Alice added a $40 dinner expense" — is composed on the device from data only the device can read.

There was an earlier version where we tried to be clever. The app would precompute an encrypted notification preview blob and send it along with the payload. The server would put that preview blob into the FCM data field. The receiving device could decrypt the preview without doing a network pull and show a rich notification faster. It worked, but it was the wrong call: it exposed extra metadata in transit, made the FCM message non-trivially larger, and meant a separate code path for "if the preview decrypts, use it; otherwise pull." We ripped it out. Now FCM messages are uniform, tiny, and contain the minimum information the device needs to pull the real payload. Google sees that something happened in some group on some device. They've always known that. They don't see what.
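To make the doorbell concrete, here is a sketch of the message shape. It only builds a plain object and does not call any push SDK. The `groupId` and `payloadId` field names and the `buildDoorbell` helper are assumptions, and platform-specific delivery options are left out.

```typescript
// A data-only push message: a device token plus string key/value data.
// There is deliberately no title or body for the OS to render on its own.
interface DoorbellMessage {
  token: string;
  data: { groupId: string; payloadId: string };
}

function buildDoorbell(pushToken: string, groupId: string, payloadId: string): DoorbellMessage {
  return { token: pushToken, data: { groupId, payloadId } };
}

const doorbell = buildDoorbell(
  "example-push-token", // placeholder, not a real token
  "3f0c2a9e-6d1b-4c57-9a41-2b7e5d8c1f60", // placeholder group UUID
  "payload-123", // placeholder payload ID
);
console.log(JSON.stringify(doorbell));
```

Every doorbell has the same two fields regardless of what happened in the group, so the push provider cannot tell an expense from a settlement or a key exchange.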


Identity recovery: where the constraint really bites

This is the part that nearly broke the constraint. Imagine you lose your phone. Your private key was on it. Your group keys were on it. Without those, a new device cannot prove it's you, and cannot decrypt anything in the groups you were in. By the rules we've set up, the server can't help you, because it has no idea what your private key was. You'd be locked out forever.

For some users, that might be fine. For most, it isn't. The whole point of a finance app is continuity. Telling people "lose your phone, lose all your groups forever" is a non-starter.

So we have to give users a recovery mechanism without breaking the constraint. We can't store their private key in a form we can decrypt. We can't store their group keys in a form we can decrypt. We can't ask for an email, because that brings the entire account-system problem back. What we can do is store an opaque, user-encrypted blob and give them back what they uploaded if they can prove they're the one who uploaded it.

The shape:

  1. On the device, the user sets a PIN. The PIN never leaves the device.
  2. The device uses Argon2id (a memory-hard KDF) with a random per-user salt to derive a key-encryption-key from the PIN. Argon2 is intentionally slow — a few hundred milliseconds per derivation — to make offline brute force on a leaked blob economically painful.
  3. The device builds a payload containing the private key, the group memberships, and the group keys. It encrypts that payload with the derived key. The result is an opaque blob the server cannot decrypt without the PIN.
  4. The device generates a random 16-character recovery code in Crockford base32 (~80 bits of entropy). It uploads (recovery_code, salt, kdfParams, encryptedBlob) to the server.
  5. The user writes down the recovery code somewhere they'll find it later.
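As a sketch of step 4: 80 random bits map onto exactly 16 Crockford base32 characters, since each character carries 5 bits. The helper names below are illustrative, not taken from the engine.

```typescript
import { randomBytes } from "node:crypto";

// Crockford's alphabet drops I, L, O and U to avoid look-alike characters.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Encode bytes 5 bits at a time, most significant bits first.
// Any trailing bits that do not fill a full 5-bit group are dropped.
function encodeCrockford(bytes: Uint8Array): string {
  let bits = 0;
  let bitCount = 0;
  let out = "";
  bytes.forEach((byte) => {
    bits = (bits << 8) | byte;
    bitCount += 8;
    while (bitCount >= 5) {
      out += CROCKFORD.charAt((bits >>> (bitCount - 5)) & 31);
      bitCount -= 5;
    }
    bits &= (1 << bitCount) - 1; // keep only the bits not yet emitted
  });
  return out;
}

// 10 random bytes = 80 bits = 16 characters, with nothing left over.
function generateRecoveryCode(): string {
  return encodeCrockford(randomBytes(10));
}

console.log(generateRecoveryCode());
```

The code comes from a cryptographic random source rather than from anything about the user or device, so knowing a device's public key gives no help in guessing it.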

On recovery from a new device:

  1. The user enters the recovery code. Server returns the sealed blob, the salt, and the KDF parameters.
  2. The user enters the PIN.
  3. The device re-derives the same key from PIN + salt, decrypts the blob, gets the private key and group keys back.

What the server stores
  • recovery_code: 16-char Crockford base32
  • salt: 16 random bytes per user
  • kdf_params: { memoryKb, iterations, parallelism }
  • encrypted_blob: AES-256-GCM ciphertext
Inside the encrypted_blob (only the device sees this)
  • device private key
  • group encryption keys × N
  • group memberships
KEK = Argon2id(PIN, salt, kdf_params)
encrypted_blob = AES-256-GCM(KEK, contents)
The PIN never reaches the server. Without it, the blob is opaque ciphertext.
Recovery requires both halves: the code (written down by the user, and the handle the server files the blob under) and the PIN (known only to the user, never sent).
[Interactive demo: a secret to back up (a stand-in for a private key) is sealed under a PIN with Argon2id + AES-256-GCM, in your browser. Try wrong PINs. Try tampering. Watch GCM catch it.]
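A runnable sketch of sealing and opening the vault. One substitution to be loud about: Node's standard library has no stable Argon2id, so this uses scrypt (also a memory-hard KDF) as a stand-in, and the parameter names therefore differ from the `kdf_params` shown above. The vault field names and function names are also assumptions of this sketch.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// scrypt parameters standing in for Argon2id's memory/iterations/parallelism.
const KDF_PARAMS = { N: 2 ** 14, r: 8, p: 1 };

interface SealedVault {
  salt: Buffer;
  kdfParams: { N: number; r: number; p: number };
  iv: Buffer;
  ciphertext: Buffer;
  tag: Buffer;
}

// KEK = KDF(PIN, salt, params). Runs only on the device.
function deriveKek(pin: string, salt: Buffer, params = KDF_PARAMS): Buffer {
  return scryptSync(pin, salt, 32, params);
}

// Everything returned here is safe to upload; the PIN is not part of it.
function sealVault(pin: string, contents: object): SealedVault {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKek(pin, salt), iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(contents), "utf8"),
    cipher.final(),
  ]);
  return { salt, kdfParams: KDF_PARAMS, iv, ciphertext, tag: cipher.getAuthTag() };
}

// Throws on a wrong PIN or a tampered blob: the GCM tag check fails in final().
function openVault(pin: string, vault: SealedVault): unknown {
  const kek = deriveKek(pin, vault.salt, vault.kdfParams);
  const decipher = createDecipheriv("aes-256-gcm", kek, vault.iv);
  decipher.setAuthTag(vault.tag);
  const plaintext = Buffer.concat([decipher.update(vault.ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString("utf8"));
}

const vault = sealVault("4821", {
  privateKey: "stand-in for the device private key",
  groupKeys: ["stand-in group key"],
});
console.log(openVault("4821", vault));
```

A short numeric PIN has little entropy on its own, which is why the slow KDF and the separate recovery code both matter: the blob is only handed out to someone who already knows the code.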

This is when my colleague's "just use a device identifier" suggestion came in. I want to walk through why that, and similar shortcuts, are not solving the same problem.

Option A: store a per-device identifier and back up against it, no PIN. This is what he proposed. The question to ask is "where does the encryption key live?" If the device generates a random key and stores it only locally, recovery is impossible — the key died with the device. If the device sends the key to the server, the server can decrypt every backup it holds; zero-knowledge is gone. If the user has to write down the key, that's just our recovery code with the PIN deleted: anyone who sees the code (sticky note, photo, shoulder surf) gets immediate access. The PIN exists because the recovery code will leak. It's defense in depth: one factor is something the user has, one is something the user knows.

Option B: tie identity to a phone number, recover via SMS. This was the next suggestion. The trouble is that the moment you have phone numbers, you have a subpoena-able list of users tied to their financial relationships, which is precisely what we were trying not to have. Plus phone numbers are notoriously fragile as auth roots — SIM swap is the standard attack on fintech that goes this route. Plus carriers recycle numbers. Plus this requires SMS verification infrastructure with its own costs and abuse vectors. The proposal sounds like a UX simplification; it's a brand-redefining architectural change.

Option C: rely on iCloud Keychain / Android cloud backup to round-trip the user's secrets through the OS vendor. This is interesting and not entirely wrong. Apple and Google already solve a version of this problem for password managers. But it ties the app to one vendor per platform, can't bridge between iOS and Android (a real Totals user moving from Android to iOS would lose access), and offloads the trust question rather than answering it.

The PIN-encrypted vault remains the only option that's both recoverable and zero-knowledge. The UX is rougher than "just biometric and you're in," and rougher than "we'll text you a code." Both of those alternatives quietly delete the privacy property. The PIN keeps it.


What we give up

I want to be honest about the costs, because it's easy to write "we built a zero-knowledge backend" and pretend it's free. It isn't.

If a user forgets their PIN and we get a support email asking for a reset, the honest answer is we can't. Their vault is sealed with a key derived from a secret only they had. We hold ciphertext we cannot decrypt. There is no override.

If everyone in a group is offline when you restore on a new phone, you see empty groups. The group's content — past expenses, member names, settled balances — lives on members' devices, not on our server. Restoring your identity gets you back into the groups, but the actual data comes from peers. If no peer is online for three days, your groups look empty for three days.

These read as bugs in the marketing version of the story. They're not bugs. They're the price of the property. You cannot have full data recovery with no user-remembered secret and a zero-knowledge backend. That's not a feature choice. That's a corollary of how cryptography works. Anyone who tells you otherwise is selling you "trust me bro" dressed up as trust by architecture.


Operational details that turned out to matter

A few things I didn't expect to be load-bearing until they were:

  • Encrypted payloads have a 30-day TTL.
  • Groups expire one year after the last payload.
  • A device can be in at most six active groups concurrently.

None of these are exciting individually. Each one was a moment of "wait, this property breaks if we don't handle the edge case." Building under a constraint means a lot of those moments.


On open source as load-bearing

The engine that powers all of this is open source, on GitHub. This isn't a marketing flourish; it's a structural piece of the trust argument.

A privacy policy can quietly change. A claim about zero-knowledge architecture can be quietly broken in a future deploy. The only way to make those claims checkable is to publish the code that runs the server. A user (or a journalist, or a researcher, or your mom) who wonders whether we really can't read their data can read the code and verify. A skeptical user can run their own instance and trust nobody.

This is also why the README explicitly notes that the same server has to be reachable by both ends of a shared group. A self-hosted Totals-Engine is its own island; groups on it don't federate with groups on ours. That's an honest tradeoff for a trust-by-architecture design — you can't have your group span trust domains, because that would require trusting both.


Why we chose the constraint

The reason to do this is not that it's clever. It's that the alternative is being one breach away from a brand-defining incident. Trust-me-bro fintechs live in a state where their security posture has to be unbreakable, because the moment their database leaks, every promise they made is broken at once. Trust-by-architecture fintechs don't have that fragility, because the leak doesn't contain the data the promise was about.

The pitch to users is symmetric. The version they were buying, "we won't read your data", was always asking them to trust us. The version we shipped, "we can't read your data", asks them to trust the math. The math has a much better track record than people's intentions.