Why your chat messages send instantly, even with no signal
Open your messaging app on a plane, type something, hit send. The message shows up right away with a little clock icon next to it. You land, your phone reconnects, and a second later that clock turns into a checkmark. The person on the other end gets it like nothing happened.
That little clock is doing more work than it looks like.
I want to walk through what's actually going on there, because the design behind it is the same thing that makes the app feel fast even when you do have signal. This is system-design level, not a code tutorial. If you're building anything that has to run on a phone, the ideas carry over.
"Online" is a lie you design around
Mobile networks drop constantly. Not just airplane mode: elevators, parking garages, the gap between your wifi cutting out and LTE kicking in, a packed stadium where everyone's phone is fighting for the same tower. Even a "good" connection has dead seconds in it.
So here's the first decision good chat apps make. Never let the network block the UI. If sending a message meant waiting for a server to respond before showing it on screen, every message would lag by a full network round trip, often 100–300ms (measure your own backend to know the real number). On a bad connection it'd feel broken.
The fix is to lie a little, on purpose. You show the message immediately, save it locally, and handle the actual sending in the background. People call this optimistic rendering. The user sees success before success has technically happened, and you reconcile later if it turns out it didn't.
(This is why I think "offline support" is a slightly misleading name. You're not bolting on a special mode for when the network's gone. You're building the normal path, and "offline" is just the case where that background send takes a while.)
What happens the moment you hit send
Roughly, in order:
1. The message gets written to local storage on the device. Right then. Before anything touches the network.
2. It appears in the conversation in a "pending" state (that clock).
3. It goes into a queue, usually called the outbox, of things waiting to go out.
4. A background worker tries to flush that queue whenever there's a connection.
Notice the network only shows up at the last step, and even then nothing on screen is waiting on it.
[DIAGRAM: user types → write to local DB → render optimistically → add to outbox → background sync to server]
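To make the send path concrete, here's a minimal sketch of the first three steps using Python's built-in sqlite3 module. The table layout and the names open_store, send_message, and pending_ids are made up for illustration; a real app would do the same thing through Room, Core Data, GRDB, or whatever it already uses.

```python
import sqlite3
import time
import uuid

# Hypothetical schema: one table for messages, one for the outbox queue.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id         TEXT PRIMARY KEY,   -- generated on the device
    convo_id   TEXT NOT NULL,
    body       TEXT NOT NULL,
    status     TEXT NOT NULL,      -- starts as 'pending'
    created_at REAL NOT NULL       -- device clock, for display only
);
CREATE TABLE IF NOT EXISTS outbox (
    message_id TEXT PRIMARY KEY REFERENCES messages(id),
    attempts   INTEGER NOT NULL DEFAULT 0
);
"""

def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db

def send_message(db, convo_id, body):
    """Persist the message, mark it pending, enqueue it. No network involved."""
    msg_id = str(uuid.uuid4())
    with db:  # one transaction: the message and its outbox row land together
        db.execute(
            "INSERT INTO messages VALUES (?, ?, ?, 'pending', ?)",
            (msg_id, convo_id, body, time.time()),
        )
        db.execute("INSERT INTO outbox (message_id) VALUES (?)", (msg_id,))
    return msg_id  # the UI can render this row right away

def pending_ids(db):
    """What a background worker would pick up on its next pass."""
    return [row[0] for row in db.execute("SELECT message_id FROM outbox")]
```

Writing both rows in a single transaction is the design choice worth copying: you never end up with a message on screen that isn't queued, or a queue entry pointing at nothing.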
Where the message actually lives
On-device, this is almost always SQLite under the hood. On Android most people use Room (which sits on SQLite); on iOS it's Core Data or SQLite directly, or something like GRDB. Some teams write their own append-only log. The exact pick matters less than one property: it has to survive the app dying.
If someone force-quits the app, or the battery hits zero, or the OS kills your background process to reclaim memory, that pending message can't vanish. So it gets written to disk, not just held in a variable in memory. (If you're on SQLite, turning on WAL mode helps here: readers don't block the writer and the writer doesn't block readers, so the UI can keep reading the conversation while the outbox is being flushed in the background.)
Durability is the whole game. A message the user believes they sent has to still be there after a crash, or you've broken the one promise chat has to keep.
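Here's a small sketch of what "survives the app dying" looks like, again with Python's sqlite3. The file name, table, and function names are assumptions for the demo, and closing the connection only stands in for a process being killed after the commit; a proper test would kill the process for real.

```python
import os
import sqlite3
import tempfile

def open_durable(path):
    db = sqlite3.connect(path)
    # Ask SQLite for write-ahead logging; the pragma returns the mode now in effect.
    mode = db.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages "
        "(id TEXT PRIMARY KEY, body TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return db, mode

def write_then_restart():
    """Commit a pending message, drop the connection, reopen, and look for it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "chat.db")
        db, mode = open_durable(path)
        with db:  # commits on exit, so the row is on disk before we go on
            db.execute(
                "INSERT INTO messages VALUES (?, ?, ?)",
                ("m1", "typed on the plane", "pending"),
            )
        db.close()  # stand-in for the app going away after the commit

        db, _ = open_durable(path)
        rows = db.execute("SELECT id, status FROM messages").fetchall()
        db.close()
        return mode, rows
```

The message is still there, still pending, after the "restart", which is exactly the state the outbox worker needs to find when the app comes back.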
The outbox, and the duplicate problem
The queue itself is usually just a table. Each pending message has a status, a created-at time, and an attempt counter so you can back off after repeated failures.
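As a rough illustration of what the attempt counter buys you, here's one common way to turn it into a retry delay: exponential backoff with a cap and random jitter. The constants and function names are placeholders I picked for the sketch, not recommendations.

```python
import random

BASE_DELAY_S = 1.0    # hypothetical: delay ceiling for the first retry
MAX_DELAY_S = 300.0   # hypothetical: never wait longer than five minutes

def backoff_ceiling(attempts):
    """Upper bound on the wait before the next try; doubles with each failed attempt."""
    exponent = min(attempts, 30)  # keeps the power small however many attempts pile up
    return min(MAX_DELAY_S, BASE_DELAY_S * (2 ** exponent))

def next_retry_delay(attempts, rng=random):
    """Pick a random delay up to the ceiling so many phones don't retry in lockstep."""
    return rng.uniform(0, backoff_ceiling(attempts))
```

With these numbers the ceiling goes 1s, 2s, 4s, 8s and so on until it hits the cap. The jitter matters most when a lot of devices reconnect at once, like that packed stadium.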
Here's the part people get wrong the first time around: every message needs a client-generated ID. A UUID made on the phone, before it's ever sent.
Why? Picture this. You send a message. The server receives it and saves it. But the response coming back to your phone gets lost (network died right at that moment). Your phone never heard "ok, got it," so it retries. Now the server has seen the same message twice.
If the server is handing out IDs, those two attempts look like two separate messages and your recipient sees a duplicate. If the client assigned the ID up front, the server can go "oh, I already have that ID, ignore this retry." That's idempotency, and it's the thing that lets you retry without fear. I'd treat client-side IDs as non-negotiable for anything with a queue and retries behind it.
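The lost-response scenario is easy to simulate. This is a toy in-memory server, not anyone's real API; the class and field names are invented for the example.

```python
import uuid

class ChatServer:
    """In-memory stand-in for a server that deduplicates on the client's ID."""

    def __init__(self):
        self.by_id = {}  # client-generated ID -> stored record
        self.log = []    # messages in the order they were first accepted

    def receive(self, client_id, body):
        existing = self.by_id.get(client_id)
        if existing is not None:
            return existing  # a retry: answer with the same ack, store nothing new
        record = {"id": client_id, "body": body, "seq": len(self.log) + 1}
        self.by_id[client_id] = record
        self.log.append(record)
        return record

# The phone makes the ID before the first attempt and reuses it on every retry.
server = ChatServer()
msg_id = str(uuid.uuid4())
server.receive(msg_id, "landed!")               # server saved it, but the ack got lost
ack = server.receive(msg_id, "landed!")         # so the phone retries with the same ID
```

After both calls the server's log holds one message, and the retry gets back the same acknowledgment the first attempt would have. In a real database you'd usually get the same effect from a unique constraint on the ID column.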
Coming back online
When the connection returns, two things have to happen, and they're separate jobs.
You flush your outbox: send everything that's been waiting, in order, retrying with backoff if the server's still unreachable.
You pull down what you missed: messages other people sent you while you were dark. This needs some kind of cursor, a sync token or a "give me everything after sequence number N" request, so you're not re-downloading the entire history every time you reconnect.
Then you merge. Your local pending messages, plus the server's version of the world. This is where ordering gets interesting, which I'll get to.
[DIAGRAM: reconnect → flush outbox + fetch since-cursor → merge → re-sort conversation]
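Here's a sketch of the pull-and-merge half, assuming the server keeps a per-conversation list ordered by sequence number and the client keeps its messages in a dict keyed by ID. Both shapes, and all the function names, are assumptions made for the example.

```python
def fetch_since(server_log, cursor, page_size=100):
    """Server side: messages with seq greater than the cursor, oldest first."""
    return [m for m in server_log if m["seq"] > cursor][:page_size]

def merge_page(local, page):
    """Client side: fold one page of server messages into the local store."""
    for msg in page:
        mine = local.get(msg["id"])
        if mine is None:
            # Something another person sent while we were dark.
            local[msg["id"]] = {"body": msg["body"], "seq": msg["seq"], "status": "received"}
        else:
            # Our own message coming back, now carrying the server's sequence number.
            mine["seq"] = msg["seq"]
            if mine["status"] == "pending":
                mine["status"] = "sent"

def sync(local, server_log, cursor, page_size=100):
    """Pull pages until caught up; return the new cursor to persist for next time."""
    while True:
        page = fetch_since(server_log, cursor, page_size)
        if not page:
            return cursor
        merge_page(local, page)
        cursor = page[-1]["seq"]
```

Anything still pending locally is left alone, since it's the outbox's job to deliver it. Calling sync again with the returned cursor fetches nothing, which is the whole point of having one.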
Sent, delivered, read are three different things
People lump these together, but each one is a different signal traveling a different distance.
Pending: it's on your device, the server hasn't confirmed it (the clock).
Sent: the server has it. One check.
Delivered: it reached the recipient's device. Two checks.
Read: the recipient actually opened the conversation. Blue checks, or "Seen."
The jump from sent to delivered to read is worth sitting with. Those last two only happen because the recipient's device sends an acknowledgment back up to the server, which then relays it to you. Delivered means their phone received it, not that they looked at it. Read means the app on their end fired a "they've seen this" event. (Plenty of apps let people switch read receipts off, which is a product and privacy call, not a technical limit.)
[DIAGRAM: state machine — pending → sent → delivered → read, with "failed" as a branch off pending]
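The states above fit in a few lines of code. This sketch adds two rules that are my assumptions rather than anything universal: acknowledgments can arrive out of order, so a status only ever moves forward, and a failed message can leave that state through a retry or a late acknowledgment.

```python
from enum import Enum

class Status(Enum):
    PENDING = "pending"
    SENT = "sent"
    DELIVERED = "delivered"
    READ = "read"
    FAILED = "failed"

# How far along the happy path each state is.
_RANK = {Status.PENDING: 0, Status.SENT: 1, Status.DELIVERED: 2, Status.READ: 3}

def apply_event(current, event):
    """Return the status after `event`, ignoring anything that would move backwards."""
    if event is Status.FAILED:
        # Only a message the server never confirmed can fail.
        return Status.FAILED if current is Status.PENDING else current
    if current is Status.FAILED:
        # A retry (PENDING) or a late server ack both take it out of failed.
        return event
    return event if _RANK[event] > _RANK[current] else current
```

So a "delivered" ack that shows up after "read" changes nothing, and a "read" that shows up straight after "sent" is accepted without waiting for "delivered".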
Media is a different animal
Everything above assumes text, which is tiny. A text message is a few hundred bytes; you can queue thousands of them without thinking about it.
A 40MB video is another story. You don't want that sitting inline in your message queue, and on a flaky connection it'll die halfway through more often than not.
The usual approach is two phases. First, upload the media to blob storage and get back a handle or URL. Then send the actual chat message, which just points at that handle. The message is small again; the heavy part happened on its own track.
For the upload itself you want something resumable, because re-sending 40MB from scratch every time the connection hiccups is miserable. S3 multipart uploads, or the tus protocol, let you pick up where you left off instead of starting over. And you generate a thumbnail on the device right away so the UI can show the photo immediately, even while the full-res version is still crawling up at whatever upload speed your users actually get.
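To show the idea of resuming without tying it to a vendor, here's a toy version: an in-memory blob store that remembers how many bytes it has, and an uploader that always asks for that offset before sending. This is not the tus wire protocol or the S3 API, just the shape they share, and every name in it is hypothetical.

```python
class BlobStore:
    """In-memory stand-in for blob storage that remembers partial uploads."""

    def __init__(self):
        self.partial = {}  # upload_id -> bytearray received so far

    def offset(self, upload_id):
        return len(self.partial.get(upload_id, b""))

    def append(self, upload_id, offset, chunk):
        buf = self.partial.setdefault(upload_id, bytearray())
        if offset != len(buf):
            raise ValueError("offset mismatch; ask the store where to resume")
        buf.extend(chunk)
        return len(buf)

def upload_resumable(store, upload_id, data, chunk_size, fail_after=None):
    """Send `data` in chunks, starting from wherever the store already is.

    `fail_after` simulates the connection dropping after that many chunks.
    """
    chunks_sent = 0
    offset = store.offset(upload_id)
    while offset < len(data):
        if fail_after is not None and chunks_sent >= fail_after:
            raise ConnectionError("simulated network drop")
        offset = store.append(upload_id, offset, data[offset:offset + chunk_size])
        chunks_sent += 1
    return f"blob://{upload_id}"  # hypothetical handle format

def media_message(handle):
    """Phase two: the chat message itself is tiny and only points at the blob."""
    return {"type": "media", "media": handle}
```

If the first call dies after three chunks, the second call with the same upload_id picks up at the fourth chunk, and only then does the small pointer message go into the normal outbox.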
Ordering, and why nobody fully agrees what "first" means
Two people in a group both send a message at "the same time." Whose lands first?
The tempting answer is to sort by timestamp. The problem is that phone clocks lie. People have the wrong time set, time zones are misconfigured, clocks drift by minutes. I've seen messages show up with a timestamp from the future, or dated to last year, because someone's device clock was just wrong. You can't trust the client's idea of "now" for ordering.
So most chat systems let the server be the source of truth. The server stamps each message with an authoritative sequence number per conversation as it arrives, and clients re-sort on that when they sync. Locally you still show your own message instantly by your own clock (optimistic, again), and when the server's ordering comes back, the list quietly settles into the real order.
That's eventual consistency in plain terms: for a moment, two people might see a slightly different order, but everyone converges on the same view once things sync. Not instantly identical. Identical soon.
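One way to express that settling in code is a sort key that prefers the server's sequence number and only falls back to the device clock for messages the server hasn't confirmed yet. Keeping unconfirmed messages at the bottom is my assumption here; apps differ on that detail, and the field names are invented.

```python
def display_order(messages):
    """Confirmed messages by server sequence; unconfirmed ones after, by local clock."""
    def key(msg):
        if msg["seq"] is not None:
            return (0, msg["seq"])        # the server's order is authoritative
        return (1, msg["created_at"])     # optimistic: not confirmed yet
    return sorted(messages, key=key)

conversation = [
    {"id": "a", "seq": 2, "created_at": 50.0},       # sender's clock was way behind
    {"id": "b", "seq": 1, "created_at": 9999.0},     # sender's clock was in the future
    {"id": "c", "seq": None, "created_at": 100.0},   # mine, still pending
]
ordered_ids = [m["id"] for m in display_order(conversation)]
```

Here ordered_ids comes out as b, a, c: the two bad clocks don't matter because the confirmed messages are sorted on sequence number alone.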
You might've read about vector clocks or CRDTs for this kind of problem. My honest take: most chat apps don't need them. A server-assigned sequence per conversation is simpler and good enough, because chat doesn't actually require two offline devices to merge edits without a referee. Save the heavy distributed-systems machinery for when you genuinely have no authority to lean on. (If you're building peer-to-peer with no central server at all, that's a different conversation, and I'm not going to pretend I'd get those details right off the top of my head.)
The tradeoff you're really making
All of this balances two things people want at the same time: messages that feel instant, and messages that never silently disappear.
You get the instant feeling from optimistic rendering. You get the reliability from writing to disk, retrying, and acknowledging at every hop. The model that makes both work in practice is at-least-once delivery plus idempotency: keep trying until it's confirmed, and make duplicates harmless so retrying is always safe. Exactly-once delivery sounds nicer, but it's brutally hard to actually guarantee, and you don't need it once your IDs make duplicates a non-issue.
The one rule I'd hold onto: the optimistic UI is allowed to lie about success, but it has to be honest about failure. If a message truly can't be sent after every retry, show it. A red exclamation, a "tap to retry," anything. The worst outcome is a message the user is certain they sent that quietly never arrived.
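Here's a sketch of that rule in a flush loop: keep trying, but after a fixed number of attempts mark the message failed so the UI has something honest to show. The attempt limit, the dict shape, and the function names are all assumptions for the example, and `send` stands in for whatever actually talks to your server.

```python
MAX_ATTEMPTS = 5  # hypothetical limit; tune for your app

def flush_once(outbox, send):
    """Try pending messages oldest first; stop at the first one that can't go out.

    `send` is any callable that raises ConnectionError when the network is down.
    """
    for msg in outbox:
        if msg["status"] != "pending":
            continue
        try:
            send(msg)
            msg["status"] = "sent"
        except ConnectionError:
            msg["attempts"] += 1
            if msg["attempts"] >= MAX_ATTEMPTS:
                msg["status"] = "failed"  # the UI shows the red exclamation now
            break  # don't let later messages jump ahead of a stuck one

def retry(msg):
    """What "tap to retry" does: put the message back in line with a clean slate."""
    msg["status"] = "pending"
    msg["attempts"] = 0
```

Because every attempt reuses the same client-generated ID, a message that was marked failed but actually reached the server is still safe to retry.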
Why go to all this trouble
Build it this way and the whole app stays useful with no signal. You can open it, read your history, write replies, queue them up, and everything flushes when you're back. No spinner holding the screen hostage.
And the part that caught me off guard the first time I built it: it makes the app faster even on a perfect connection, because you've stopped waiting on round trips to render anything. The offline machinery and the "feels snappy" machinery turn out to be the same machinery. That's the real reason to design offline-first, even when most of your users are online most of the time.