Mimicra

Overview

Full-stack SaaS, solo, from scratch in ~1 month

Async Python backend (FastAPI + SQLAlchemy async), Next.js frontend, Celery for background jobs, credit-based billing with two payment processors (card + crypto), passwordless OTP login, and a full admin panel. Same API endpoints work from the browser via cookie and from automation via Bearer token.

Guests can try synthesis without registration. Registered users get 5,000 credits and access to all providers. Users with personal API keys bypass billing entirely.

Synthesis Engine

Three synthesis modes

01 / Sync & Async modes

Submit, poll, or get notified via webhook

Synchronous: returns the audio binary directly, so it plays in the browser immediately.

Async with webhook: submit a job, receive a transaction_id, then poll for status or wait for a webhook notification on completion. Webhook delivery is retried with exponential backoff, up to 3 attempts, with failure recovery.
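The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name deliver_webhook, the 1-second base delay, and the injectable send and sleep callables are all assumptions made so the example stays self-contained.

```python
import time
from typing import Callable


def deliver_webhook(
    send: Callable[[], bool],
    max_attempts: int = 3,
    base_delay: float = 1.0,  # assumed value; the source does not state the delay
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() until it reports success or the attempts run out."""
    for attempt in range(max_attempts):
        try:
            if send():
                return True
        except Exception:
            # Treat transport errors the same way as a rejected delivery.
            pass
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)
    return False
```

Passing sleep as a parameter keeps the backoff schedule testable without real waiting; a caller that gets False back can record the job as undelivered for later recovery.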

Async jobs · Webhook callbacks · Retry + backoff

02 / Split Scenes mode

Script → multiple MP3s → ZIP archive

Splits the script on a regex scene separator and synthesizes each scene individually with the same voice. Returns a ZIP archive of numbered .mp3 files, one per scene. Designed for long-form narration and audiobook production.
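A rough sketch of the split-then-archive flow, under stated assumptions: the default separator (a line containing only ---), the three-digit file names, and the synthesize callable are hypothetical stand-ins, since the source does not specify them.

```python
import io
import re
import zipfile
from typing import Callable

# Assumed separator: a line that contains only "---".
DEFAULT_SEPARATOR = r"^[ \t]*---[ \t]*$"


def split_scenes(script: str, separator: str = DEFAULT_SEPARATOR) -> list[str]:
    """Split a script into non-empty scenes on the separator regex."""
    parts = re.split(separator, script, flags=re.MULTILINE)
    return [part.strip() for part in parts if part.strip()]


def build_scene_archive(
    script: str,
    synthesize: Callable[[str], bytes],  # hypothetical: text in, MP3 bytes out
    separator: str = DEFAULT_SEPARATOR,
) -> bytes:
    """Synthesize every scene and pack the results into an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for index, scene in enumerate(split_scenes(script, separator), start=1):
            archive.writestr(f"{index:03d}.mp3", synthesize(scene))
    return buffer.getvalue()
```

Zero-padded names keep the files in scene order when the archive is unpacked and sorted alphabetically.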

Regex split · Batch synthesis · ZIP output

03 / Controls & HQ mode

Speed, pitch, volume · 2× cost for HQ

Speed, pitch, and volume are passed to Azure as SSML parameters; OpenAI receives speed only. HQ mode applies a 2× cost multiplier and triggers higher-quality synthesis on supported providers. Texts that exceed a provider's limit are chunked via TextSplitter.
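For illustration, the Azure controls might map onto an SSML prosody element along these lines. This is only a sketch: representing speed, pitch, and volume as multipliers around 1.0 is an assumption, and build_ssml and the example voice name are not taken from the project.

```python
from xml.sax.saxutils import escape, quoteattr


def _percent(multiplier: float) -> str:
    """Turn a multiplier such as 1.1 into a signed percentage such as '+10%'."""
    return f"{round((multiplier - 1.0) * 100):+d}%"


def build_ssml(
    text: str,
    voice: str,
    lang: str = "en-US",
    speed: float = 1.0,   # assumed representation: 1.0 means unchanged
    pitch: float = 1.0,
    volume: float = 1.0,
) -> str:
    """Wrap text in speak/voice/prosody elements with relative adjustments."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        f'<prosody rate="{_percent(speed)}" pitch="{_percent(pitch)}" '
        f'volume="{_percent(volume)}">'
        f"{escape(text)}"  # escape &, <, > so user text cannot break the markup
        "</prosody></voice></speak>"
    )
```

Escaping the user text matters here because the script is arbitrary input that ends up inside an XML document.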

04 / Multi-Provider Architecture

Provider factory — swap without touching synthesis logic

Unicorn

Proprietary provider. Multi-key routing: each voice template is assigned to a specific key. System credentials are stored in the DB.

ElevenLabs

Standard and white/grey-label API. 700+ voices including cloning.

Azure

SSML-based synthesis, multi-locale, full pitch/speed/volume control.

OpenAI TTS

6 voices, speed control. Clean natural output for English narration.

Minimax

Integrated internally, not exposed to end users.
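One common way to get this kind of swap-without-touching-synthesis-logic behaviour is a small registry keyed by provider name. The sketch below is an assumption about the general shape, not the project's factory; TTSProvider, register, get_provider, and the toy EchoProvider are hypothetical names.

```python
from typing import Callable, Protocol


class TTSProvider(Protocol):
    """Interface every provider adapter is assumed to implement."""

    def synthesize(self, text: str, voice: str) -> bytes: ...


_REGISTRY: dict[str, Callable[[], TTSProvider]] = {}


def register(name: str):
    """Class decorator that adds a provider to the registry under `name`."""
    def decorator(cls):
        _REGISTRY[name] = cls
        return cls
    return decorator


def get_provider(name: str) -> TTSProvider:
    """Instantiate the provider registered under `name`."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown provider: {name}") from None


@register("echo")
class EchoProvider:
    """Toy provider used only for this example; returns text as bytes."""

    def synthesize(self, text: str, voice: str) -> bytes:
        return f"{voice}:{text}".encode()


def synthesize(provider_name: str, text: str, voice: str) -> bytes:
    # The synthesis logic depends only on the interface, never on a vendor SDK.
    return get_provider(provider_name).synthesize(text, voice)
```

Adding a provider then means writing one adapter class and registering it; the calling code stays unchanged.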

Billing & Access

05 / 4-Level Role System

From anonymous guest to superadmin

Role	Access
guest	No registration. Unicorn only, max 500 chars, 5 requests/day per IP.
user	Registered. Credit-based. Access to all providers.
paid	Same access as user, with paid subscription status.
admin	Full access, zero cost, admin panel.

06 / Credit-Based Billing

Race-condition safe, auto-refund on failure

Every user starts with 5,000 credits. Cost = text_length × provider.cost_multiplier. Admins and own-key users pay 0. SELECT FOR UPDATE on the balance row prevents race conditions. Credits auto-refunded on failed generation. Credits expire after 365 days of inactivity.
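The cost rule and the refund-on-failure flow can be sketched in plain Python as below. Several details are assumptions: rounding up to whole credits, the Account fields, and the in-memory balance update, which stands in for a database transaction that would lock the balance row with SELECT ... FOR UPDATE.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class Account:
    balance: int
    is_admin: bool = False
    uses_own_key: bool = False


class InsufficientCredits(Exception):
    pass


def synthesis_cost(text: str, cost_multiplier: float, account: Account,
                   hq: bool = False) -> int:
    """Cost = text length x provider multiplier, doubled for HQ mode."""
    if account.is_admin or account.uses_own_key:
        return 0
    cost = len(text) * cost_multiplier * (2 if hq else 1)
    return math.ceil(cost)  # assumption: fractional credits round up


def charge_and_run(account: Account, text: str, cost_multiplier: float,
                   generate: Callable[[str], bytes], hq: bool = False) -> bytes:
    """Deduct credits, run the generation, refund if it fails."""
    cost = synthesis_cost(text, cost_multiplier, account, hq)
    # In a real service this check-and-deduct would run inside a transaction
    # holding a row lock, so two concurrent requests cannot both pass the check.
    if account.balance < cost:
        raise InsufficientCredits(f"need {cost}, have {account.balance}")
    account.balance -= cost
    try:
        return generate(text)
    except Exception:
        account.balance += cost  # auto-refund on failed generation
        raise
```

For example, a 10-character text with a multiplier of 1.5 costs 15 credits, or 30 in HQ mode.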

WayForPay (card) · WhitePay (crypto) · SELECT FOR UPDATE · Auto-refund · 365-day expiry

07 / API Access

Same endpoints, two auth methods

Every registered user gets a personal ozv-XXXX API token. Auth accepts either a session cookie (browser) or an Authorization: Bearer header (automation), with identical endpoints and identical logic. Rate limits: 5 requests/day per IP for guests · 20 requests/min for authenticated users.
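A minimal sketch of how one endpoint could accept both credential types. The cookie name, the function name, and the choice to also accept a JWT sent as a Bearer value are assumptions; only the ozv- prefix comes from the text above.

```python
from typing import Mapping, Optional

API_TOKEN_PREFIX = "ozv-"
SESSION_COOKIE = "session"  # hypothetical cookie name


def resolve_credentials(
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
) -> Optional[tuple[str, str]]:
    """Return (kind, token), where kind is 'api_token' or 'jwt', or None."""
    scheme, _, value = headers.get("Authorization", "").partition(" ")
    token = value.strip()
    if scheme.lower() == "bearer" and token:
        # The prefix tells an API token apart from a JWT without parsing it.
        kind = "api_token" if token.startswith(API_TOKEN_PREFIX) else "jwt"
        return kind, token
    cookie = cookies.get(SESSION_COOKIE)
    if cookie:
        return "jwt", cookie
    return None
```

In a FastAPI app, logic like this would typically live in a single dependency so that every route sees the same resolved user regardless of how the request authenticated.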

08 / Two Celery Queues

Independent workers, independent scaling

priority_high

Paid users and priority jobs. Separate worker process.

default

Standard queue. Can scale workers independently of priority queue.
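Queue selection could be as small as the function below. The routing rule (paid role or an explicit priority flag) is an assumption drawn from the description above, and select_queue is a hypothetical name.

```python
HIGH_PRIORITY_QUEUE = "priority_high"
DEFAULT_QUEUE = "default"


def select_queue(role: str, priority: bool = False) -> str:
    """Pick the Celery queue name for a synthesis job."""
    if priority or role == "paid":
        return HIGH_PRIORITY_QUEUE
    return DEFAULT_QUEUE


# With Celery, the chosen name would be passed when enqueuing, for example:
#   some_task.apply_async(args=[job_id], queue=select_queue(user_role))
# and each worker process would be started with -Q <queue name> so the two
# queues are consumed, and scaled, separately. `some_task`, `job_id`, and
# `user_role` are placeholders.
```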

09 / Audio Lifecycle

Auto-cleanup, soft delete

Guest files are deleted after 7 days, registered user files after 14 days. Celery Beat runs the cleanup daily at 3 AM UTC. Files are first marked with a soft is_deleted flag, then physically removed.
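The retention rule and the soft-delete step might look like this in outline. The AudioFile shape and the function names are assumptions for illustration; physical removal of the files is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

GUEST_RETENTION = timedelta(days=7)
REGISTERED_RETENTION = timedelta(days=14)


@dataclass
class AudioFile:
    path: str
    created_at: datetime
    owner_is_guest: bool
    is_deleted: bool = False


def is_expired(audio: AudioFile, now: datetime) -> bool:
    """True once the file is older than the retention window for its owner."""
    retention = GUEST_RETENTION if audio.owner_is_guest else REGISTERED_RETENTION
    return now - audio.created_at >= retention


def mark_expired(files: list[AudioFile],
                 now: Optional[datetime] = None) -> list[AudioFile]:
    """Set the soft is_deleted flag on expired files; return the newly marked."""
    now = now or datetime.now(timezone.utc)
    marked = []
    for audio in files:
        if not audio.is_deleted and is_expired(audio, now):
            audio.is_deleted = True
            marked.append(audio)
    return marked
```

A scheduled task would call something like mark_expired once a day and then remove the flagged files from storage in a second step.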

10 / User Preferences

JSONB, persisted across sessions

Last provider, voice, language, speed, pitch, theme, and UI language stored in a JSONB preferences column. UI restores full state on page load.
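As a sketch of how such a preferences blob could be updated safely, the helper below merges stored values over defaults and ignores unknown keys. The key names and default values are assumptions based on the list above, not the actual column contents.

```python
from typing import Any, Optional

# Hypothetical defaults; the real keys and values are not given in the source.
DEFAULT_PREFERENCES: dict[str, Any] = {
    "provider": None,
    "voice": None,
    "language": None,
    "speed": 1.0,
    "pitch": 1.0,
    "theme": "light",
    "ui_language": "en",
}


def merge_preferences(stored: Optional[dict], updates: dict) -> dict:
    """Apply updates on top of stored preferences, dropping unknown keys."""
    merged = {**DEFAULT_PREFERENCES, **(stored or {})}
    for key, value in updates.items():
        if key in DEFAULT_PREFERENCES:
            merged[key] = value
    return merged
```

Merging over defaults means a newly added preference key gets a sensible value for users whose stored blob predates it.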

11 / Admin Panel

Full operational control

Block/unblock users, role changes, balance adjustments. Provider management: cost multipliers, enable/disable. Unicorn multi-key and template→key assignment. Global notifications and expense tracking.

Security

No passwords.
No plaintext credentials.

Passwordless login via OTP. All third-party keys encrypted. Login history tracked per session.

01

Passwordless OTP login — one-time codes via email through Resend. No passwords stored anywhere.

02

JWT cookie + Bearer token — ozv- prefix eliminates token ambiguity. Same endpoints handle both auth methods.

03

Rate limiting via fastapi-limiter + Redis on all public endpoints. IP + user agent registration fingerprinting for anti-abuse.

04

Login history tracking — IP and user agent recorded per session. is_blocked flag for instant suspension without deletion.

05

Sentry error tracking with traces, profiles, and PII enabled. Webhook delivery with exponential backoff retry.
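The passwordless OTP login from item 01 could be sketched as follows. The 6-digit length, the 10-minute lifetime, and storing only a SHA-256 hash of the code are assumptions for illustration; sending the email through Resend is omitted.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

OTP_TTL_SECONDS = 600  # assumed lifetime; not stated in the source


def generate_otp(digits: int = 6) -> str:
    """Return a zero-padded numeric code from a cryptographic RNG."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


def hash_otp(code: str, email: str) -> str:
    """Hash the code together with the email so only the hash is stored."""
    return hashlib.sha256(f"{email.lower()}:{code}".encode()).hexdigest()


def verify_otp(submitted: str, email: str, stored_hash: str,
               issued_at: float, now: Optional[float] = None) -> bool:
    """Accept the code only if it is unexpired and matches the stored hash."""
    now = time.time() if now is None else now
    if now - issued_at > OTP_TTL_SECONDS:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(hash_otp(submitted, email), stored_hash)
```

A production flow would also invalidate the code after first use and cap the number of attempts, which pairs naturally with the Redis-backed rate limiting listed in item 03.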

Testing

Coverage areas

7 test domains

Auth flow, token validation, cookie vs Bearer header

Admin security: role enforcement, superuser gates

Billing: hold, refund, insufficient funds, race conditions

Synthesis pipeline: sync, async, chunking, split scenes

Voice sync and provider credential management

TextSplitter edge cases and boundary conditions

Schema validation: HQ mode, advanced synthesis config
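To make the TextSplitter boundary conditions listed above concrete, here is a hypothetical stand-in splitter with the kind of behaviour such tests would pin down. It is not the project's TextSplitter; the preference order (sentence end, then space, then hard cut) is an assumption.

```python
def split_text(text: str, limit: int) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to cut after sentence punctuation, then at a space, and falls
    back to a hard cut when a single word is longer than the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: list[str] = []
    rest = text.strip()
    while len(rest) > limit:
        window = rest[:limit]
        cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
        if cut != -1:
            cut += 1  # keep the punctuation mark in the current chunk
        else:
            cut = window.rfind(" ")
        if cut <= 0:
            cut = limit  # no usable boundary: hard cut
        chunks.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    if rest:
        chunks.append(rest)
    return chunks
```

Edge cases worth asserting include empty input, text exactly at the limit, and a single token longer than the limit.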

Stack & Infrastructure

Async all the way down.

FastAPI + SQLAlchemy async + asyncpg — no sync shortcuts. Traefik handles SSL automatically. Alembic runs migrations on every deploy. Celery Beat for scheduled cleanup and credit expiration.

Python

FastAPI

PostgreSQL

SQLAlchemy async

Alembic

Celery + Redis

Next.js

Docker

Traefik

Sentry

ElevenLabs

Azure TTS

OpenAI TTS

Resend
