May 4, 2026
Building Spooool as a Central Command for Video
A YouTube alternative on Cloudflare-only infrastructure
Spooool is a video host where every piece of the runtime lives on Cloudflare: frontend, API, storage, encoding pipeline, rate limiting, fan-out. No origin server, no Postgres, no Redis, no S3. One wrangler deploy is the whole release.
Diversified content infrastructure matters more than ever: algorithmic censorship, AI-driven account-level errors, and the difficulty of reaching customer support all make depending on a single video platform risky. Cloudflare's primitives lend themselves to building a central command for streaming and publishing video content across platforms.
My Solution
Let’s walk through how the interesting parts actually work, because most “build X on Cloudflare” posts stop at a Worker that returns JSON. The parts worth dwelling on are the seams: chunked uploads to R2, a queue-driven encoding handoff to Stream, a Durable Object per channel for subscriber fan-out, and a token-bucket rate limiter that’s also a DO.
The bindings, in one place
Everything starts in wrangler.toml. The platform surface is the architecture:
[[r2_buckets]] binding = "VIDEOS"
[[d1_databases]] binding = "DB"
[[kv_namespaces]] binding = "CACHE" # hot reads
[[kv_namespaces]] binding = "SESSIONS" # multipart upload state
[[queues.producers]] binding = "VIDEO_ENCODING"
[[analytics_engine_datasets]] binding = "ANALYTICS"
[[durable_objects.bindings]] name = "CHANNEL_SUBSCRIBER_DO"
[[durable_objects.bindings]] name = "RATE_LIMITER"
[triggers] crons = ["0 2 * * *"] # GDPR sweep
A single Worker (src/workers/index.ts) owns fetch, queue, and scheduled handlers and routes via Hono. Static assets are served from [assets] with run_worker_first for /api/*, /watch/* (so OG tags can be injected via HTMLRewriter), and SEO endpoints.
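Concretely, the entry point has roughly this shape. A minimal sketch: EnvBindings, handleEncodingBatch, and runNightlySweeps are stand-in names I chose for the real modules, not the repo's actual exports.

import { Hono } from 'hono';

interface EnvBindings {
  VIDEOS: R2Bucket;
  DB: D1Database;
  CACHE: KVNamespace;
  SESSIONS: KVNamespace;
  VIDEO_ENCODING: Queue;
  ANALYTICS: AnalyticsEngineDataset;
  CHANNEL_SUBSCRIBER_DO: DurableObjectNamespace;
  RATE_LIMITER: DurableObjectNamespace;
}

// stand-ins for the real modules; both are sketched later in this post
declare function handleEncodingBatch(
  batch: MessageBatch<unknown>,
  env: EnvBindings,
  ctx: ExecutionContext,
): Promise<void>;
declare function runNightlySweeps(
  controller: ScheduledController,
  env: EnvBindings,
  ctx: ExecutionContext,
): Promise<void>;

const app = new Hono<{ Bindings: EnvBindings }>();
// ...upload, watch, webhook, and SEO routes registered here

export default {
  fetch: app.fetch,            // HTTP: API, watch pages, OG-tag injection
  queue: handleEncodingBatch,  // consumer side of VIDEO_ENCODING
  scheduled: runNightlySweeps, // the 02:00 UTC cron
} satisfies ExportedHandler<EnvBindings>;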
Chunked uploads straight to R2 from the Worker
The non-obvious bit: spooool doesn’t use pre-signed URLs.
Chunks are POSTed as multipart/form-data to the Worker, which proxies into R2’s multipart upload API. State for an in-progress upload lives in KV (SESSIONS), not D1 — there’s no row until the upload completes.
The flow in src/workers/videos.ts:
// chunkIndex === 0: open the multipart upload, stash uploadId in KV
const multipart = await env.VIDEOS.createMultipartUpload(r2Key, {
  httpMetadata: { contentType: rawFile.type },
});
const firstPart = await multipart.uploadPart(1, rawFile.stream());
await env.SESSIONS.put(mpidKey, multipart.uploadId, { expirationTtl: 86400 });

// chunkIndex > 0: resume and upload by part number
const multipart = env.VIDEOS.resumeMultipartUpload(uploadMeta.r2Key, multipartUploadId);
const uploadedPart = await multipart.uploadPart(chunkIndex + 1, rawFile.stream());

// last chunk: complete and enqueue encoding
await multipart.complete(completedParts);
await env.VIDEO_ENCODING.send({ videoId, r2Key });
Three KV keys per session — :mpid, :meta, :parts — all 24h TTL’d so abandoned uploads garbage-collect themselves. Cumulative byte counting on each chunk enforces MAX_VIDEO_BYTES without reading R2.
A subtle detail: the per-user rate limit only fires on chunkIndex === 0. Subsequent chunks are unbounded — they’re already paying for an established session, and rate-limiting them would just turn slow networks into failed uploads.
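The byte cap itself might look like this. A sketch under assumptions: I'm guessing the :meta value carries a running byte count, and the field names and 4 GiB cap are illustrative, not the repo's actual names or limit. EnvBindings is from the entry-point sketch above.

const MAX_VIDEO_BYTES = 4 * 1024 * 1024 * 1024; // illustrative 4 GiB cap

async function enforceByteCap(
  env: EnvBindings,
  sessionId: string,
  chunk: File,
): Promise<Response | null> {
  const metaRaw = await env.SESSIONS.get(`${sessionId}:meta`);
  if (!metaRaw) return new Response('unknown upload session', { status: 404 });

  // bytesSoFar is a hypothetical field: a running total across chunks
  const meta = JSON.parse(metaRaw) as { r2Key: string; bytesSoFar: number };
  meta.bytesSoFar += chunk.size; // File.size is known without touching R2

  if (meta.bytesSoFar > MAX_VIDEO_BYTES) {
    // abort server-side so orphaned parts don't linger until the 24h TTL
    const uploadId = await env.SESSIONS.get(`${sessionId}:mpid`);
    if (uploadId) {
      await env.VIDEOS.resumeMultipartUpload(meta.r2Key, uploadId).abort();
    }
    return new Response('video exceeds size limit', { status: 413 });
  }

  await env.SESSIONS.put(`${sessionId}:meta`, JSON.stringify(meta), {
    expirationTtl: 86400,
  });
  return null; // null means the chunk may proceed
}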
Encoding: a queue, not a request
Once multipart.complete() returns, the Worker sends a message to the VIDEO_ENCODING queue and replies 201 to the client. The same Worker is also the queue consumer — declared via export default containing a queue handler:
async queue(batch: MessageBatch<unknown>, env: EnvBindings) {
  for (const message of batch.messages) {
    try { await handleEncodingMessage(env, message.body); message.ack(); }
    catch { message.retry(); }
  }
}
handleEncodingMessage either submits the R2 object to Cloudflare Stream (POST /accounts/:id/stream with url: "r2://...") or marks it pending_encode for a future R2-only encoding path. Stream then calls back into /api/webhooks/stream, which updates the row’s stream_video_id and status. The API surface and the encoder are the same Worker — but they’re different events on the same script, so a slow encode never blocks an upload response.
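The webhook side could be as small as this. A sketch, not the repo's code: I'm assuming the payload carries the Stream uid plus a meta.videoId we attached at submission, and the videos table columns are guesses from the prose above.

app.post('/api/webhooks/stream', async (c) => {
  // production code should verify the webhook signature header first
  const payload = await c.req.json<{
    uid: string;
    readyToStream?: boolean;
    meta?: { videoId?: string };
  }>();

  const videoId = payload.meta?.videoId;
  if (!videoId) return c.json({ error: 'missing videoId' }, 400);

  // flip the row from pending to ready once Stream has the encode
  await c.env.DB.prepare(
    'UPDATE videos SET stream_video_id = ?, status = ? WHERE id = ?',
  )
    .bind(payload.uid, payload.readyToStream ? 'ready' : 'encoding', videoId)
    .run();

  return c.json({ ok: true });
});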
A Durable Object per channel, for fan-out
When a creator uploads, every subscriber needs an inbox row. With 10k subscribers and concurrent uploads from the same creator, naïve fan-out stampedes D1.
ChannelSubscriberDO is keyed by channel:${userId} — one instance per creator — and uses blockConcurrencyWhile to serialise fan-out for that creator. It pages subscribers in batches of 200 with cursor pagination on the subscriber_user_id index, and writes to subscription_inbox with ON CONFLICT DO NOTHING so retries are safe:
const id = ns.idFromName(`channel:${payload.channelUserId}`);
const stub = ns.get(id);
await stub.fetch('https://channel-do/fan-out', { method: 'POST', body: ... });
Note triggerFanOut is best-effort — its errors are logged, not surfaced to the upload response. If the DO is down, the upload still succeeds; fan-out can be re-triggered later.
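Condensed, the DO looks roughly like this. The subscriptions table name and payload shape are my guesses from the description above; the batch size of 200, the cursor column, and the ON CONFLICT clause mirror it. EnvBindings is from the entry-point sketch.

export class ChannelSubscriberDO {
  constructor(private state: DurableObjectState, private env: EnvBindings) {}

  async fetch(request: Request): Promise<Response> {
    const { channelUserId, videoId } = await request.json<{
      channelUserId: string;
      videoId: string;
    }>();

    // serialize fan-outs for this creator; concurrent uploads queue here
    await this.state.blockConcurrencyWhile(async () => {
      let cursor = '';
      for (;;) {
        // page subscribers with cursor pagination on subscriber_user_id
        const page = await this.env.DB.prepare(
          `SELECT subscriber_user_id FROM subscriptions
           WHERE channel_user_id = ? AND subscriber_user_id > ?
           ORDER BY subscriber_user_id LIMIT 200`,
        ).bind(channelUserId, cursor).all<{ subscriber_user_id: string }>();

        const rows = page.results;
        if (rows.length === 0) break;

        // idempotent inserts make DO retries safe
        const stmt = this.env.DB.prepare(
          `INSERT INTO subscription_inbox (subscriber_user_id, video_id)
           VALUES (?, ?) ON CONFLICT DO NOTHING`,
        );
        await this.env.DB.batch(rows.map((r) => stmt.bind(r.subscriber_user_id, videoId)));

        cursor = rows[rows.length - 1].subscriber_user_id;
      }
    });

    return new Response('ok');
  }
}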
Rate limiting, also a Durable Object
Token-bucket rate limiting on Cloudflare is a great DO use case: you need a single source of truth per (bucket, identity) and you need atomic decrement. KV is eventually consistent and would let two concurrent requests both observe a full bucket.
RateLimiterDO stores { tokens, lastRefillMs } and computes the refill on read:
const refilled = Math.min(capacity, startTokens + (elapsedMs / 1000) * refillPerSecond);
const allowed = refilled >= cost;
const tokens = allowed ? refilled - cost : refilled;
blockConcurrencyWhile serialises take() per identity. Capacity and refill rate are passed in from the caller, so changing policy ("auth writes are now 30/min, not 10/min") doesn’t require a DO migration — only the Hono middleware in rate-limit.ts changes.
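Put together, the DO's core is about this much code. A sketch assuming the policy triple travels with each request; the HTTP shape is mine, the refill math is the post's.

export class RateLimiterDO {
  constructor(private state: DurableObjectState) {}

  async fetch(request: Request): Promise<Response> {
    const { capacity, refillPerSecond, cost } = await request.json<{
      capacity: number;
      refillPerSecond: number;
      cost: number;
    }>();

    const allowed = await this.state.blockConcurrencyWhile(async () => {
      const now = Date.now();
      const saved = await this.state.storage.get<{
        tokens: number;
        lastRefillMs: number;
      }>('bucket');
      const startTokens = saved?.tokens ?? capacity;
      const elapsedMs = now - (saved?.lastRefillMs ?? now);

      // lazy refill: compute whatever accrued since the last take()
      const refilled = Math.min(capacity, startTokens + (elapsedMs / 1000) * refillPerSecond);
      const ok = refilled >= cost;
      const tokens = ok ? refilled - cost : refilled;

      await this.state.storage.put('bucket', { tokens, lastRefillMs: now });
      return ok;
    });

    return new Response(null, { status: allowed ? 200 : 429 });
  }
}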
Caching: KV with a version bump, not invalidation
Trending videos are cached in KV. Instead of deleting cache keys when a video is uploaded or deleted, spooool bumps a version counter:
const version = await getTrendingCacheVersion(c.env.CACHE);
const cacheKey = trendingCacheKey(version, limit);
Old keys remain and expire on TTL; new requests miss against the new version and repopulate. This dodges the eventual-consistency problem of trying to delete a KV key globally and then immediately reading it back.
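The version machinery fits in a few lines; the key names here are my assumptions, not the repo's.

async function getTrendingCacheVersion(cache: KVNamespace): Promise<number> {
  return Number((await cache.get('trending:version')) ?? '0');
}

// called on upload/delete; old keys keep serving until their TTL expires
async function bumpTrendingCacheVersion(cache: KVNamespace): Promise<void> {
  const v = await getTrendingCacheVersion(cache);
  await cache.put('trending:version', String(v + 1));
}

function trendingCacheKey(version: number, limit: number): string {
  return `trending:v${version}:limit${limit}`;
}

The bump is itself a KV write, so a colo can briefly read the old version; the cost is slightly stale trending data, which is exactly the failure mode this design accepts.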
What the cron does
The scheduled handler runs runDeletionSweep (hard-deletes users past their 30-day GDPR grace window) and runDmcaRestoreSweep daily at 02:00 UTC. Wrapped in ctx.waitUntil so a slow sweep doesn’t block the next tick.
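What that might look like, assuming a deletion_requested_at column marks the start of the grace window (the schema is my guess, not the repo's):

async function runNightlySweeps(
  _event: ScheduledController,
  env: EnvBindings,
  ctx: ExecutionContext,
): Promise<void> {
  ctx.waitUntil(
    (async () => {
      // hard-delete accounts whose 30-day grace window has lapsed
      const cutoffMs = Date.now() - 30 * 24 * 60 * 60 * 1000;
      await env.DB.prepare(
        'DELETE FROM users WHERE deletion_requested_at IS NOT NULL AND deletion_requested_at < ?',
      )
        .bind(cutoffMs)
        .run();
      // runDmcaRestoreSweep(env) would follow the same pattern
    })(),
  );
}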
What’s worth stealing from this design
Three things I’d lift into other Cloudflare projects:
Workers as both producer and consumer of their own queue. Same script, separate event types — no second deployment, no service binding indirection. Treats encoding as a different event on the same code, not a different service.
DO-per-entity for any “serialize per X” problem. Channel fan-out and rate limiting look unrelated; both are “atomic op against a per-X state machine” and both are 100 lines of DO code.
KV state for in-progress workflows. D1 rows only exist for completed entities. Half-finished uploads, encoding sessions, and other transient state belong in KV with a TTL — your relational schema stays clean.
The whole stack — frontend, API, encoder, fan-out, rate limiter, scheduled jobs — is one TypeScript codebase, one deploy, and (by my own cost analysis) somewhere between $0.50 and $15/month for a non-trivial test load. That’s the actual sell of building on a single platform: not free tier forever, but architectural seams that are bindings, not network calls.
Repo: Spooool