Nestful has quite an interesting use case: it is an offline-first app, yet it supports cross-device sync. This means it must somehow reconcile data even when that data is the result of multiple offline sessions coming back online at different times.
Most of you are already aware that the most common way to do this today is to use Conflict-free Replicated Data Types (CRDTs). As their name suggests, they are replicated data structures that can be combined, with algorithms guaranteeing that the data will eventually converge.
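To make that convergence guarantee a bit more concrete, here is a minimal sketch of a toy CRDT, a last-writer-wins map. This is not how Yjs works internally; the `Entry` shape and the `merge` function are hypothetical names made up for illustration. The point is only that the merge is commutative and idempotent, so two replicas end up identical no matter who merges first.

```typescript
// Toy last-writer-wins (LWW) map: an illustrative CRDT, NOT the Yjs algorithm.
type Entry = { value: string; timestamp: number; clientId: string };
type LwwMap = Record<string, Entry>;

// Pick a winner deterministically: the newer timestamp wins,
// and the client id breaks ties so every replica agrees.
function newer(a: Entry, b: Entry): Entry {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
  return a.clientId > b.clientId ? a : b;
}

// Merge two replicas key by key. Because `newer` does not care about
// argument order, merge(a, b) and merge(b, a) give the same result.
function merge(left: LwwMap, right: LwwMap): LwwMap {
  const result: LwwMap = { ...left };
  for (const key of Object.keys(right)) {
    const incoming = right[key];
    if (incoming === undefined) continue;
    const existing = result[key];
    result[key] = existing ? newer(existing, incoming) : incoming;
  }
  return result;
}

// Two devices edit the same task title while offline.
const phone: LwwMap = {
  title: { value: "Buy milk", timestamp: 1, clientId: "phone" },
};
const laptop: LwwMap = {
  title: { value: "Buy oat milk", timestamp: 2, clientId: "laptop" },
};

// Whichever device syncs first, both end up with the same state.
const phoneThenLaptop = merge(phone, laptop);
const laptopThenPhone = merge(laptop, phone);
```

A real CRDT like Yjs keeps far richer metadata than a single timestamp, but the contract is the same: merging is safe to repeat and safe to reorder.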
Nestful uses Yjs, a fast JavaScript CRDT. Yjs is truly a magnificent piece of software. Each user has their own Yjs Document, where most data is stored. For local persistence, that data is synced into IndexedDB, the in-browser database, using y-indexeddb, which is part of Yjs’ provider ecosystem.
Providers are pluggable pieces of code that hook into a Yjs document for added functionality. This is mostly in the form of sync to and from various places, from our local IndexedDB database to a middle ground in the form of y-websocket, to a full-blown solution like y-sweet.
Although I would have been glad to find a ready-made provider and integrate that into Nestful, things were not so easy. Commercial providers like y-sweet were in their infancy, so I didn’t even bother evaluating them. The most battle-tested solution was y-websocket, but that was just the communication layer. I would have had to write a non-negligible amount of backend code, which development (and maintenance) bandwidth did not allow for.
What do we do, then? To understand our options we need to figure out where Nestful was at the time, and what it is that we need to sync.
Nestful’s legacy
Historically, Nestful was written as a generic, simple CRUD app using PostgreSQL via Supabase. I thought to myself – it’s a personal-use todo app, why would I need anything else? I’d say that for most todo apps that decision would have been correct (and it surely was correct at the time, for other reasons). The thing is, Nestful has two very important traits that change almost everything:
Nestful is offline-first, which now requires constructing a sync mechanism
Nestful uses a tree structure, and traversing it can require significant recursion
Don’t let anyone tell you otherwise – both are perfectly solvable and maintainable using PostgreSQL, even if they lend themselves better to something like a document-based CRDT.
Nestful’s other circumstances, including the way it was coded and my annoyance with the local DB I was using at the time (a story for another time about how not to maintain FOSS), pushed me to switch.
So now we know that Nestful is hosted on Supabase, relying on its Auth and its Database, and most important of all – Nestful does not deploy any backend. This means that adding one will incur not only the complexity of the functionality we want to add, but also that of the expanded codebase, its deployment, and the server maintenance.
Doing that is all well and good, we are a software company after all, but let’s do better.
Update here, update there
The way sync works with Yjs (and most other CRDTs) is by using an update mechanism. Yjs allows the developer to obtain binary updates, each representing a data diff, and then send those over the wire. There are 3 main ways to do that:
Listen to changes
Diff against another version of the data
Get everything
Those updates can then be consumed in any order, and they will converge to the same final state. Now that we know we want to sync updates, the questions are where we save them and how they get there.
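To illustrate those three ways of obtaining updates, and the any-order property, here is a minimal sketch using a toy replica. Everything in it (`ToyReplica`, `Op`, `diffAgainst`, `encodeAll`) is a hypothetical stand-in, not the Yjs API. In Yjs itself, the three ways roughly correspond to the document’s `update` event, `Y.encodeStateAsUpdate` called with another document’s state vector, and `Y.encodeStateAsUpdate` called with no state vector, with real updates being binary `Uint8Array`s rather than arrays of objects.

```typescript
// Toy replica illustrating the three ways of obtaining updates.
// All names are hypothetical; this is NOT the Yjs API.
type Op = { id: string; key: string; value: string; clock: number };
type Update = Op[];

class ToyReplica {
  private ops = new Map<string, Op>();
  private listeners: Array<(update: Update) => void> = [];

  // Way 1: listen to changes as they happen locally.
  onUpdate(listener: (update: Update) => void): void {
    this.listeners.push(listener);
  }

  set(op: Op): void {
    this.ops.set(op.id, op);
    this.listeners.forEach((listener) => listener([op]));
  }

  // Way 2: diff against another version (only what the other side lacks).
  diffAgainst(knownIds: Set<string>): Update {
    return Array.from(this.ops.values()).filter((op) => !knownIds.has(op.id));
  }

  // Way 3: get everything.
  encodeAll(): Update {
    return Array.from(this.ops.values());
  }

  knownIds(): Set<string> {
    return new Set(Array.from(this.ops.keys()));
  }

  // Ops are keyed by id, so applying is idempotent and order-independent.
  apply(update: Update): void {
    update.forEach((op) => this.ops.set(op.id, op));
  }

  // Derived state: the highest clock wins per key, the op id breaks ties.
  snapshot(): Record<string, string> {
    const winners = new Map<string, Op>();
    this.ops.forEach((op) => {
      const current = winners.get(op.key);
      const wins =
        !current ||
        op.clock > current.clock ||
        (op.clock === current.clock && op.id > current.id);
      if (wins) winners.set(op.key, op);
    });
    const state: Record<string, string> = {};
    Array.from(winners.values())
      .sort((a, b) => (a.key < b.key ? -1 : 1))
      .forEach((op) => {
        state[op.key] = op.value;
      });
    return state;
  }
}

const phoneReplica = new ToyReplica();
const uploaded: Update[] = [];
phoneReplica.onUpdate((update) => uploaded.push(update)); // way 1

phoneReplica.set({ id: "phone-1", key: "title", value: "Buy milk", clock: 1 });
phoneReplica.set({ id: "phone-2", key: "done", value: "true", clock: 2 });

// The laptop has only seen the first update so far.
const laptopReplica = new ToyReplica();
uploaded.slice(0, 1).forEach((update) => laptopReplica.apply(update));

const missing = phoneReplica.diffAgainst(laptopReplica.knownIds()); // way 2
const everything = phoneReplica.encodeAll(); // way 3

// Apply the same information in different orders, duplicates included.
const replicaA = new ToyReplica();
replicaA.apply(everything);
replicaA.apply(missing);

const replicaB = new ToyReplica();
replicaB.apply(missing);
[...uploaded].reverse().forEach((update) => replicaB.apply(update));
```

Both `replicaA` and `replicaB` end up with the same snapshot, even though one received a full dump followed by a redundant diff and the other received the incremental updates in reverse.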
Remember Nestful using Supabase? Well, Supabase has an S3 equivalent called Supabase Storage. The other side of the coin of provider lock-in is, of course, blissful integration. After a minimal permissions setup, end users can push updates to Storage and fetch the updates that are already there.
Of course, just fetching all the updates all the time can be wasteful (and slow), so we’ll have to do some tweaking.
Almost converged
To keep the client from downloading a humongous number of files, which over these “serverless” platforms is almost as slow as the term “serverless” is stupid, we can write a small “serverless” function to consolidate the updates for us.
That function will receive a date from the client, and will send over a single file.
When a client wants to get the latest data, it sends a date to the just-a-function-not-a-server. Upon receiving the combined update, the client diffs its own state against it and uploads the diff directly to the just-object-storage-not-a-server.
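The consolidation step could be sketched roughly like this. The `StoredUpdate` shape, the `consolidate` name, and the rule of keeping only files uploaded strictly after the given date are all assumptions for illustration, not Nestful’s actual function. The merge itself is injected as a parameter; with Yjs that would presumably be something like `Y.mergeUpdates`, but here a plain concatenation stands in so the sketch has no dependencies.

```typescript
// Hypothetical shape of an update file sitting in object storage.
type StoredUpdate = { uploadedAt: Date; bytes: Uint8Array };

// The merge step is injected so the sketch stays library-agnostic.
type MergeUpdates = (updates: Uint8Array[]) => Uint8Array;

// Return a single combined update containing everything uploaded after
// `since` (assumption: strictly after, oldest first).
function consolidate(
  files: StoredUpdate[],
  since: Date,
  mergeUpdates: MergeUpdates
): Uint8Array {
  const recent = files
    .filter((file) => file.uploadedAt.getTime() > since.getTime())
    .sort((a, b) => a.uploadedAt.getTime() - b.uploadedAt.getTime())
    .map((file) => file.bytes);
  return mergeUpdates(recent);
}

// Stand-in merge for this example only: plain concatenation.
// A real CRDT merge would also deduplicate and compact the updates.
const concatMerge: MergeUpdates = (updates) => {
  const total = updates.reduce((sum, update) => sum + update.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const update of updates) {
    out.set(update, offset);
    offset += update.length;
  }
  return out;
};

const files: StoredUpdate[] = [
  { uploadedAt: new Date("2024-01-03T00:00:00Z"), bytes: new Uint8Array([3]) },
  { uploadedAt: new Date("2024-01-01T00:00:00Z"), bytes: new Uint8Array([1]) },
  { uploadedAt: new Date("2024-01-02T00:00:00Z"), bytes: new Uint8Array([2]) },
];

// The client last synced at noon on January 1st, so only the
// two later files end up in the combined update.
const combined = consolidate(
  files,
  new Date("2024-01-01T12:00:00Z"),
  concatMerge
);
```

Since updates converge regardless of order, the sort is not strictly required for correctness; it just keeps the output deterministic.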
This is called a full sync. At this point, the client and server are fully in sync. Any further update on the client is:
Uploaded directly to object storage
Broadcast to peers using Supabase, to avoid a roundtrip
Using this scheme also has the added benefit of being able to restore data to a point in time, even if it was garbage collected by Yjs.
And that’s how Nestful syncs your data. This simple, client-dependent approach serves us well. It works, it’s stable and low maintenance, but… it’s slow. Nestful does not, in fact, require fast sync. It is not a real-time collaborative platform like other products using Yjs, so sync can happen in the background and take a few seconds, and no one would notice.
But… we can’t settle for that, can we?
The future ahead
Even though it is fine for now (and for a long time from now) that a full sync is slow, we may need a bit more performance in the future. Although this may require getting an actual server and setting up an actual backend, the current design will stay mostly the same.
Since end-to-end encryption is planned for Nestful, the server will not be able to consolidate updates. Instead, a client will have to do that periodically and upload an encrypted checkpoint, which will serve as the new starting point for the next consolidation.
For this, we’re going to need to cache both the metadata and the data of the update files in object storage since the latest checkpoint, so that we can quickly and efficiently serve whatever is needed. But that is for the future.
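As a rough sketch of what the checkpoint idea could look like on the reading side, a client would only need the newest checkpoint plus whatever updates were uploaded after it. The `FileMeta` shape and the `filesToFetch` function are hypothetical, and encryption is left out entirely; this only shows the selection logic over cached metadata.

```typescript
// Hypothetical cached metadata for files in object storage.
type FileMeta = {
  name: string;
  uploadedAt: number; // e.g. a Unix timestamp in milliseconds
  kind: "checkpoint" | "update";
};

// Pick the newest checkpoint plus every update uploaded after it.
// With no checkpoint yet, fall back to all updates.
function filesToFetch(files: FileMeta[]): FileMeta[] {
  let latest: FileMeta | undefined;
  for (const file of files) {
    if (file.kind !== "checkpoint") continue;
    if (latest === undefined || file.uploadedAt > latest.uploadedAt) {
      latest = file;
    }
  }
  const cutoff = latest ? latest.uploadedAt : -Infinity;
  const updates = files
    .filter((file) => file.kind === "update" && file.uploadedAt > cutoff)
    .sort((a, b) => a.uploadedAt - b.uploadedAt);
  return latest ? [latest, ...updates] : updates;
}

const cachedMeta: FileMeta[] = [
  { name: "u1", uploadedAt: 100, kind: "update" },
  { name: "c1", uploadedAt: 200, kind: "checkpoint" },
  { name: "u2", uploadedAt: 300, kind: "update" },
  { name: "c2", uploadedAt: 400, kind: "checkpoint" },
  { name: "u4", uploadedAt: 600, kind: "update" },
  { name: "u3", uploadedAt: 500, kind: "update" },
];

// Only the newest checkpoint and the two updates after it are needed.
const toFetch = filesToFetch(cachedMeta).map((file) => file.name);
```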
Check out the marvel of CRDT syncing by trying Nestful now, and also try Nestful itself while you’re at it.
Frequent readers will remember Nestful is written using Gleam and will probably wonder how we call Yjs from inside our lovely Gleam code. Here’s a YGleam link for you.
You may also be interested in our latest story about moving to an Elm-like state management.