what we learned building a mobile ar recorder end-to-end

April 28, 2026 · 6 min read

how we built a cross-platform ar recording pipeline with deterministic sampling, mcap serialization, and reliable multipart uploads

this started as "just record ar"

i thought this would be easy.

open ar session -> capture frames -> save -> upload

done.

then reality hit:

  • camera -> around 30 fps
  • imu -> 100 to 200 hz
  • depth -> around 15 hz
  • tracking randomly pauses
  • ios and android behave differently

and suddenly:

nothing lines up.

the core realization

this is not a camera problem.

this is a time synchronization problem across multiple asynchronous streams.

system architecture

here is the actual system we ended up building:

[diagram: native capture -> orchestrator -> writer -> upload]

each layer has one job:

  • native -> capture
  • orchestrator -> decisions
  • writer -> structure
  • upload -> reliability

this separation is what made the system stable.

the core loop (this is everything)

this runs around 30 times per second. each tick pulls the latest sample from every stream and stamps the bundle with a single timestamp.
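a minimal sketch of one tick (typescript for readability; all types and names are hypothetical):

```typescript
// one sample from any stream: a timestamp plus a payload
interface Sample { t: number; data: string }

// sources expose only "give me the freshest value you have"
interface Sources {
  latestFrame(): Sample | null;
  latestPose(): Sample | null;
  latestDepth(): Sample | null;
}

interface TickRecord { t: number; frame: Sample; pose: Sample; depth: Sample | null }

// one tick of the core loop: grab the freshest value from each stream
// and stamp the whole bundle with a single capture timestamp
function tick(now: number, src: Sources, out: TickRecord[]): void {
  const frame = src.latestFrame();
  const pose = src.latestPose();
  if (frame === null || pose === null) return; // tracking paused: skip this tick
  out.push({ t: now, frame, pose, depth: src.latestDepth() }); // depth is optional (~15 hz)
}
```

the important property: every record carries one timestamp chosen by the loop, not by whichever stream happened to fire last.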

this loop is your ground truth generator.

deterministic sampling (the real unlock)

initially we did:

if a frame arrives -> record it

this breaks instantly.

the correct model

sample on a fixed wall-clock schedule, and at each tick record the latest frame seen so far.
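the schedule-driven model, sketched (hypothetical names; typescript for readability). targets are fixed by the rate; frames are held zero-order until the next one arrives:

```typescript
// deterministic sampling: target times come from the rate, not from frame arrival.
// for each target we emit the latest frame seen at or before it.
function sampleSchedule(
  frameTimes: number[], // arrival timestamps of frames (seconds, sorted ascending)
  rateHz: number,
  duration: number,
): Array<{ target: number; frame: number | null }> {
  const ticks = Math.floor(duration * rateHz + 1e-9); // epsilon guards float rounding
  const out: Array<{ target: number; frame: number | null }> = [];
  let i = -1; // index of the latest frame at or before the current target
  for (let k = 0; k <= ticks; k++) {
    const target = k / rateHz;
    while (i + 1 < frameTimes.length && frameTimes[i + 1] <= target) i++;
    out.push({ target, frame: i >= 0 ? frameTimes[i] : null });
  }
  return out;
}
```

because targets are computed, not observed, two devices with different camera timing produce the same cadence.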

why this matters


this gives you:

  • device-independent datasets
  • stable cadence (15 hz, 30 hz)
  • zero long-term drift

what ar frameworks actually give you

both arkit and arcore are doing slam:

  • visual tracking (camera)
  • inertial tracking (imu)
  • map reconstruction

they output:

  • pose (6dof)
  • camera frame
  • feature points
  • depth (if available)

but:

these are not synchronized streams.

the real problem: multi-rate data


everything must align on:

timestamp, not frame index.
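aligning on timestamp in practice means nearest-timestamp lookup, e.g. finding the imu sample closest to a frame. a sketch (assumes the imu timestamps are sorted):

```typescript
// align by timestamp: return the index of the imu sample nearest to time t.
// binary search over the sorted timestamps keeps this o(log n).
function nearestIndex(imuTimes: number[], t: number): number {
  let lo = 0, hi = imuTimes.length - 1;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (imuTimes[mid] < t) lo = mid + 1; else hi = mid;
  }
  // lo is the first index with time >= t; check if the predecessor is closer
  if (lo > 0 && t - imuTimes[lo - 1] < imuTimes[lo] - t) return lo - 1;
  return lo;
}
```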

deep dive: mcap (why this was a game changer)

we moved from:

a folder of per-stream files (frames, imu logs, depth)

to:

a single self-contained .mcap file per recording

what mcap actually is

mcap is a container format for heterogeneous timestamped data (mcap.dev).

it's not an encoding; it wraps multiple streams into one file.

why it exists

before mcap:

  • ros bags (hard to use outside ros)
  • sqlite logs (not self-contained)
  • custom formats (painful)

mcap solves this by being:

  • self-contained
  • multi-stream
  • indexed
  • append-only (Foxglove)

mcap mental model


each stream = topic

each entry = timestamped message

actual file structure (internal)

[diagram: internal mcap file layout]

key concept: records

mcap is built from records:

  • schema -> defines structure
  • channel -> defines topic
  • message -> actual data
  • chunk -> batch of messages
  • index -> fast lookup

(Monday Morning Haskell)
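the record hierarchy can be modeled directly: a message points at a channel, a channel points at a schema. a sketch (illustrative types, not the wire format):

```typescript
// the linkage between record kinds, as plain types
interface Schema  { id: number; name: string; encoding: string }
interface Channel { id: number; schemaId: number; topic: string }
interface Message { channelId: number; logTime: number; data: Uint8Array }

// resolve a message back to its topic and schema name via its channel
function describe(
  msg: Message,
  channels: Map<number, Channel>,
  schemas: Map<number, Schema>,
): string {
  const ch = channels.get(msg.channelId)!;
  const sc = schemas.get(ch.schemaId)!;
  return `${ch.topic} (${sc.name}) @ ${msg.logTime}`;
}
```

this indirection is why one file can hold many streams: the message itself stays tiny, just a channel id, a timestamp, and bytes.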

why chunking matters


our system:

  • android -> around 1 mb chunks
  • ios -> around 512 kb

this gives:

  • high write throughput
  • fewer disk ops
  • recoverable files

mcap's append-only design even allows recovery after crashes (Foxglove).
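the chunking idea, sketched: buffer serialized messages and flush once the buffer crosses a size threshold, so disk writes happen per chunk, not per message (names hypothetical; thresholds from above):

```typescript
// chunked writer sketch: accumulate messages, flush at ~1 mb (android)
// or ~512 kb (ios) so each disk write covers a whole batch
class ChunkBuffer {
  private pending: Uint8Array[] = [];
  private pendingBytes = 0;
  readonly flushed: Uint8Array[][] = []; // each entry = one chunk written to disk

  constructor(private thresholdBytes: number) {}

  add(msg: Uint8Array): void {
    this.pending.push(msg);
    this.pendingBytes += msg.byteLength;
    if (this.pendingBytes >= this.thresholdBytes) this.flush();
  }

  flush(): void {
    if (this.pending.length === 0) return;
    this.flushed.push(this.pending); // one disk op per chunk, not per message
    this.pending = [];
    this.pendingBytes = 0;
  }
}
```

anything already flushed survives a crash, which is what makes the append-only recovery story work.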

indexing (this is huge)

without index:

scan the whole file to find one topic or time range

with index:

seek straight to the chunk you need

mcap supports:

  • topic-based lookup
  • timestamp-based seeking
  • partial reads over network (Segments.ai)

this is critical when:

  • files are 500 mb and larger
  • data is remote

serialization layer (ros2 + cdr)

our pipeline uses:

  • ros2 message schemas
  • cdr encoding

cdr (common data representation):

  • binary serialization format used by dds and ros2
  • ensures cross-language compatibility

so each message becomes:

a cdr-encoded binary payload tagged with its topic and timestamp
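a simplified sketch of the encoding step (real cdr adds a 4-byte encapsulation header and alignment padding; since every field here is a float64, the payload is just 7 packed little-endian doubles for a 6dof pose):

```typescript
// simplified cdr-style little-endian encoding of a pose:
// position (x, y, z) followed by orientation quaternion (x, y, z, w)
function encodePose(px: number, py: number, pz: number,
                    qx: number, qy: number, qz: number, qw: number): Uint8Array {
  const buf = new ArrayBuffer(7 * 8);
  const view = new DataView(buf);
  [px, py, pz, qx, qy, qz, qw]
    .forEach((v, i) => view.setFloat64(i * 8, v, true)); // true = little-endian
  return new Uint8Array(buf);
}
```

fixed binary layout is what makes the messages readable from any language later.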

ios vs android architecture

android (pull model)

the app drives the loop: on each tick it asks arcore for the latest frame.

ios (push model)

arkit drives the loop: frames arrive through delegate callbacks, and the app has to buffer them.

why this matters

| problem | android | ios |
| -------------- | ------- | -------- |
| timing control | easy | hard |
| buffering | minimal | required |
| backpressure | rare | common |

this is why:

  • ios needs queues and backpressure handling
  • android can stay simpler
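the backpressure half can be sketched as a bounded queue that drops the oldest frame instead of blocking the capture thread (hypothetical structure, not the arkit api):

```typescript
// push model survival kit: the framework pushes faster than the writer
// drains, so the queue is bounded and sheds the oldest item on overflow
class BoundedQueue<T> {
  private items: T[] = [];
  dropped = 0; // how many frames we shed under load

  constructor(private capacity: number) {}

  push(item: T): void {
    if (this.items.length >= this.capacity) {
      this.items.shift(); // drop oldest rather than block the capture callback
      this.dropped++;
    }
    this.items.push(item);
  }

  pull(): T | undefined { return this.items.shift(); }
}
```

the pull model needs none of this: the loop only ever asks for one latest frame.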

upload architecture

naive approach

upload the whole file in one request. one dropped connection and you start over.

actual system

split the recording into parts, upload them in parallel, and retry or resume each part independently.

implementation idea

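a sketch of the part-based upload in typescript (`uploadPart` stands in for the real network call; all names hypothetical):

```typescript
// multipart upload sketch: split the file into parts, upload each with
// retries, skip parts already marked complete (that's the resumability)
async function uploadFile(
  file: Uint8Array,
  partSize: number,
  completed: Set<number>, // part indices already uploaded in a prior attempt
  uploadPart: (index: number, part: Uint8Array) => Promise<void>,
  maxRetries = 3,
): Promise<void> {
  const inflight: Promise<void>[] = [];
  for (let i = 0; i * partSize < file.byteLength; i++) {
    if (completed.has(i)) continue; // resume: skip finished parts
    const part = file.subarray(i * partSize, (i + 1) * partSize);
    inflight.push((async () => {
      for (let attempt = 0; ; attempt++) {
        try { await uploadPart(i, part); completed.add(i); return; }
        catch (e) { if (attempt + 1 >= maxRetries) throw e; } // retry per part
      }
    })());
  }
  await Promise.all(inflight); // parts upload in parallel
}
```

the `completed` set is the whole resume story: persist it, and a crashed upload picks up where it left off.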

this gives:

  • resumability
  • parallelism
  • reliability

hardest problems (real ones)

these took the most time:

  • imu and frame timestamp alignment
  • tracking loss compensation
  • deterministic sampling correctness
  • writer backpressure
  • storage exhaustion handling

these are invisible in demos, but they define production systems.

final takeaway

ar recording is a time synchronization problem across asynchronous streams, not a camera problem.

once you understand this:

  • sampling becomes obvious
  • mcap makes sense
  • uploads become solvable

closing

this started as:

let's record ar

it became:

  • real-time systems
  • data engineering
  • serialization design
  • distributed uploads

and honestly, that's what made it worth building.