MARTY BUROLLA
ROXBOX Home Screen

ROXBOX

ROXBOX is a premium, high-performance Progressive Web App (PWA) designed for private music management and playback. Unlike standard streaming services, ROXBOX is built on a "Light Proxy" architecture that bridges cloud availability with local extraction power. It allows users to manage a massive S3-backed library while leveraging home hardware for heavy lifting like audio extraction and AI indexing.

The system is designed for audiophiles and fitness enthusiasts alike, featuring built-in workout timers for the Norwegian 4x4 interval protocol and a glassmorphic aesthetic that feels premium on any device.

System Architecture

The ROXBOX ecosystem is built on a sophisticated multi-tier architecture that prioritizes security and performance. By leveraging a Cloud-to-Home Bridge via Tailscale, the system maintains the global availability of a serverless application while utilizing the heavy-lifting capabilities of local hardware for extraction and AI processing.

ROXBOX System Architecture

NextJS Application

The frontend is built with Next.js 15 and React 19 and uses the App Router to split the application in two: the client-side PWA and the "Cloud Bridge," a set of stateless API routes that handle authentication, library manifest aggregation, and request proxying.

By using a PWA approach, ROXBOX offers an "installable" experience on iOS and Android without the overhead of native app stores. Styling is done in vanilla CSS with glassmorphism effects, aiming for smooth 60fps animations even on older mobile hardware. Sentry telemetry is integrated across the entire stack for real-time observability and distributed tracing.

ROXBOX Phone Library

AWS S3 Cloud Storage

ROXBOX leverages AWS S3 as its primary storage engine, treating the cloud as a highly available "Source of Truth." A central manifest.json file tracks the entire library metadata, allowing for near-instant library browsing regardless of size.
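To illustrate why a single manifest keeps browsing fast, here is a minimal sketch in Python. The manifest layout and field names (tracks, key, title, artist, duration_s) are assumptions made for illustration; the actual schema of manifest.json is not described here. The point is that once the manifest is downloaded, searching the library is an in-memory operation with no further S3 requests.

```python
import json

# Hypothetical manifest layout: every field name below is illustrative only.
SAMPLE_MANIFEST = """
{
  "version": 1,
  "tracks": [
    {"key": "music/artist-a/first-song.mp3", "title": "First Song",
     "artist": "Artist A", "duration_s": 215},
    {"key": "music/artist-b/second-song.mp3", "title": "Second Song",
     "artist": "Artist B", "duration_s": 187}
  ]
}
"""


def load_manifest(raw: str) -> list[dict]:
    """Parse the manifest JSON and return its list of track records."""
    return json.loads(raw)["tracks"]


def search_tracks(tracks: list[dict], query: str) -> list[dict]:
    """Case-insensitive substring match on title or artist, entirely in memory."""
    q = query.casefold()
    return [
        t for t in tracks
        if q in t["title"].casefold() or q in t["artist"].casefold()
    ]


tracks = load_manifest(SAMPLE_MANIFEST)
matches = search_tracks(tracks, "artist b")
```

In this sketch, matches holds only the record for "Second Song"; the audio file itself would be fetched separately using the record's key.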

Audio files are streamed directly from S3, while localforage (IndexedDB) provides a persistent client-side cache for offline playback. This "Cloud-First, Offline-Capable" strategy ensures your music is available whether you're on a gigabit connection or in a dead zone.

AI-Driven Similar Search

To solve the problem of music discovery in private collections, ROXBOX implements an AI-powered acoustic fingerprinting system. When a new song is added, an AWS Lambda function (containerized with FFmpeg and Python) extracts its acoustic features and stores them as high-dimensional vectors in Pinecone.

The Next.js frontend can then perform real-time vector similarity searches, allowing users to find "songs that sound like this" with sub-second latency. This turns a static collection into a dynamic, searchable ecosystem.
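The idea behind "songs that sound like this" can be sketched with a brute-force nearest-neighbor search. This is not the Pinecone API: a vector database answers the same kind of query with an index instead of a full scan. The three-dimensional vectors, the track IDs, and the choice of cosine similarity as the metric are all assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_id: str, vectors: dict[str, list[float]], top_k: int = 3):
    """Rank every other track by similarity to the query track (full scan)."""
    query = vectors[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in vectors.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy feature vectors; real acoustic embeddings have far more dimensions.
LIBRARY = {
    "track-a": [0.9, 0.1, 0.0],
    "track-b": [0.8, 0.2, 0.1],
    "track-c": [0.0, 0.1, 0.9],
}

results = most_similar("track-a", LIBRARY, top_k=2)
```

Here track-b ranks ahead of track-c because its vector points in nearly the same direction as track-a's.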

Similar Search Results 1
Similar Search Results 2

YouTube Audio Extraction

The heavy lifting of audio extraction is delegated to a FastAPI service running on local hardware (the "Home Engine"). When a user searches YouTube in the PWA, the request is proxied through an AWS Lambda bridge into a Tailscale mesh tunnel.

The Home Engine uses yt-dlp for high-quality extraction and ffmpeg for MP3 conversion. Once processed, files are automatically uploaded to S3, triggering the cloud indexer. This "Light Proxy" setup provides the security and power of a home server with the global accessibility of a cloud app.
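The extraction step can be sketched as a command builder. The yt-dlp flags below are real options; with --extract-audio and --audio-format mp3, yt-dlp hands the conversion to ffmpeg. The function names, the output directory, and the S3 key prefix are hypothetical, and the Home Engine's actual invocation is not documented here.

```python
from pathlib import PurePosixPath


def build_extract_command(video_url: str, out_dir: str) -> list[str]:
    """Build a yt-dlp argv that downloads one video's audio and converts it to MP3."""
    # %(id)s and %(ext)s are yt-dlp output-template fields filled in at download time.
    output_template = str(PurePosixPath(out_dir) / "%(id)s.%(ext)s")
    return [
        "yt-dlp",
        "--no-playlist",          # ignore any playlist the URL belongs to
        "--extract-audio",        # keep audio only (needs ffmpeg on PATH)
        "--audio-format", "mp3",  # convert the extracted audio to MP3
        "--audio-quality", "0",   # 0 = best VBR quality
        "--output", output_template,
        video_url,
    ]


def s3_key_for(mp3_path: str, prefix: str = "music/") -> str:
    """Hypothetical key scheme: a fixed prefix plus the local file name."""
    return prefix + PurePosixPath(mp3_path).name


cmd = build_extract_command("https://www.youtube.com/watch?v=VIDEO_ID", "/tmp/roxbox")
key = s3_key_for("/tmp/roxbox/VIDEO_ID.mp3")
```

On a machine with yt-dlp and ffmpeg installed, the list could be passed to subprocess.run(cmd, check=True); building it as a list rather than a shell string avoids quoting problems with user-supplied URLs.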

Settings and Extraction Config
YouTube Search

Fitness: Norwegian 4x4 Integration

Beyond simple playback, ROXBOX is designed as a performance training tool. It features a native implementation of the Norwegian 4x4 High-Intensity Interval Training (HIIT) protocol.

The app automatically manages your workout timers (Warm-Up, 4-minute Burn intervals, and 3-minute Active Recovery sets) and synchronizes them with specific playlists in your library. This eliminates the need for external timer apps and ensures your highest-energy tracks hit exactly when your heart rate needs to peak.
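The timer logic can be sketched as a list of phases plus a lookup by elapsed time. The 4-minute burn and 3-minute recovery lengths come from the description above; the 10-minute warm-up, the recovery after the final burn, and all names in the code are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Phase:
    name: str
    seconds: int


def norwegian_4x4(warmup_min: int = 10, rounds: int = 4,
                  burn_min: int = 4, recovery_min: int = 3) -> list[Phase]:
    """Build the phase list: one warm-up, then alternating burn and recovery."""
    phases = [Phase("Warm-Up", warmup_min * 60)]
    for i in range(1, rounds + 1):
        phases.append(Phase(f"Burn {i}", burn_min * 60))
        phases.append(Phase(f"Active Recovery {i}", recovery_min * 60))
    return phases


def phase_at(phases: list[Phase], elapsed_s: int) -> Optional[Phase]:
    """Return the phase active at elapsed_s, or None once the workout is over."""
    end = 0
    for phase in phases:
        end += phase.seconds
        if elapsed_s < end:
            return phase
    return None


plan = norwegian_4x4()
total_seconds = sum(p.seconds for p in plan)
```

With these defaults the plan has nine phases and lasts 38 minutes. A player could call phase_at on each tick and switch playlists whenever the returned phase name changes.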

Norwegian 4x4 Warm Up
Norwegian 4x4 Burn Set
Norwegian 4x4 Rest Set

Podcasts & Long-Form Audio

ROXBOX extends its management capabilities to podcasts and long-form audio. Using the same S3-backed architecture, users can curate their own podcast feeds and audiobooks. The app maintains playback position and provides specialized controls for navigating longer tracks, ensuring a seamless transition between high-energy music and spoken-word content.
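As a rough sketch of resume and skip behavior for long tracks: the rewind-on-resume window, the "treat as finished" margin, and the function names are all assumptions, not documented ROXBOX behavior.

```python
def resume_position(saved_s: float, duration_s: float,
                    rewind_s: float = 5.0, finished_margin_s: float = 30.0) -> float:
    """Pick where to resume: restart if nearly finished, else back up a little."""
    if saved_s >= duration_s - finished_margin_s:
        return 0.0  # close enough to the end that the episode counts as finished
    return max(0.0, saved_s - rewind_s)


def skip(position_s: float, delta_s: float, duration_s: float) -> float:
    """Move forward or backward by delta_s, clamped to the track's bounds."""
    return min(max(0.0, position_s + delta_s), duration_s)


# Example: an hour-long episode paused at 2:05 resumes at 2:00.
resumed = resume_position(125.0, 3600.0)
skipped = skip(resumed, 30.0, 3600.0)
```

Backing up a few seconds on resume helps a listener regain context, which matters more for spoken-word content than for music.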