RLFLOW — ARCHITECTURE v2.0

RLFlow — RLHF Platform

Full-stack, end-to-end platform for RLHF data collection and annotation. It covers the human-feedback collection stage of the kind of RLHF pipeline used to train models such as ChatGPT and Claude.

🟢 Frontend (Live) 🟢 API Docs (Live) 🟢 Grafana Cloud
01

System Architecture Overview

Frontend (React + TypeScript)
  Task Creator UI
  Annotator Workspace
  Metrics Dashboard
  Export Manager
↕
API Layer (FastAPI + Python)
  Auth / JWT + RBAC
  Task API
  Feedback API
  Metrics API
  Export API
  Rate Limiting (slowapi)
  Groq LLM Integration
↕
Worker Layer (Redis + Celery)
  Task Assignment Queue
  Quality Scoring Worker
  Export Builder Worker
↕
Data Layer
  PostgreSQL: Tasks, Feedback, Users
  Redis: Queues + Cache
  Upstash Redis (TLS): Queues
  JSONL Streaming Export

Observability and Infrastructure
  Grafana Dashboards
  Prometheus Metrics
  Docker + Docker Compose
  Render (Production)
  GitHub Actions CI/CD
  Grafana Cloud
02

Core Feature Modules

🧩

Task Management

Researchers create coding or reasoning tasks with prompts, expected behaviors, and evaluation criteria. Tasks are versioned and tagged.

tasks_api.py
👥

Annotator Workspace

Annotators are assigned tasks from a queue. They submit reward signals, qualitative feedback, and binary preference labels for agent outputs.

annotations_api.py
🏆

Reward Signal Collection

Collects scalar rewards, pairwise rankings, and free-text rationales per task. Aggregates inter-annotator agreement scores.

feedback_api.py
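
The document does not say which agreement statistic is used. As one possible illustration, Cohen's kappa for two annotators who labelled the same items (for example, binary preference labels) could be computed as below. The function name and the choice of kappa are assumptions, not taken from feedback_api.py.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items, in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Fraction of items on which the two annotators gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)
```

Kappa corrects raw agreement for chance: 1.0 means perfect agreement, 0.0 means no better than chance. With more than two annotators per task, a statistic such as Fleiss' kappa or Krippendorff's alpha would be a more natural fit.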
📊

Quality Tracking

Monitors annotator consistency, task completion rates, feedback distributions, and statistical outliers in reward signals.

quality_worker.py
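
A minimal sketch of outlier flagging on reward signals, assuming a robust rule based on the median absolute deviation (MAD). The threshold of 3.5 and the function name are illustrative; the rule quality_worker.py actually applies is not described in this document.

```python
from statistics import median


def flag_reward_outliers(rewards, threshold=3.5):
    """Return indices of rewards whose modified z-score exceeds `threshold`."""
    if not rewards:
        return []
    med = median(rewards)
    mad = median(abs(r - med) for r in rewards)
    if mad == 0:
        # No measurable spread around the median, so this rule flags nothing.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, r in enumerate(rewards)
        if 0.6745 * abs(r - med) / mad > threshold
    ]
```

The median-based score is less sensitive than a mean/standard-deviation z-score to the very outliers it is trying to detect, which matters when a task has only a handful of annotations.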
📈

Metrics Dashboard

Real-time Grafana dashboard showing task throughput, reward distributions, annotator performance, and dataset health.

metrics_api.py
📦

Dataset Export

Generates clean JSONL/Parquet files formatted for RL fine-tuning (RLHF, DPO). Filters by quality thresholds before export.

export_worker.py
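
As a rough sketch of the quality filter plus DPO formatting step: the input record shape (`prompt`, `quality_score`, `outputs` with `text` and `reward`) and the 0.7 threshold are assumptions for illustration. The `prompt` / `chosen` / `rejected` keys follow a common convention for DPO datasets, not a format confirmed by this document.

```python
import json


def to_dpo_jsonl(pairs, min_quality=0.7):
    """Yield one JSONL line per preference pair that passes the quality filter."""
    for pair in pairs:
        if pair["quality_score"] < min_quality:
            continue  # drop low-quality annotations before export
        first, second = pair["outputs"]
        if first["reward"] == second["reward"]:
            continue  # a tie carries no preference signal
        if first["reward"] > second["reward"]:
            chosen, rejected = first, second
        else:
            chosen, rejected = second, first
        yield json.dumps({
            "prompt": pair["prompt"],
            "chosen": chosen["text"],
            "rejected": rejected["text"],
        })
```

Because the function is a generator, lines can be written to a file or streamed to a client one at a time without holding the full dataset in memory.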
03

PostgreSQL Database Schema

users
  id            uuid PK
  email         varchar
  role          enum
  created_at    timestamp

tasks
  id            uuid PK
  creator_id    uuid FK
  type          enum
  prompt        text
  status        enum
  metadata      jsonb

task_assignments
  id            uuid PK
  task_id       uuid FK
  annotator_id  uuid FK
  assigned_at   timestamp
  completed_at  timestamp

feedback
  id             uuid PK
  assignment_id  uuid FK
  reward_scalar  float
  rationale      text
  quality_score  float

agent_outputs
  id            uuid PK
  task_id       uuid FK
  model_id      varchar
  output        text
  generated_at  timestamp

export_jobs
  id          uuid PK
  created_by  uuid FK
  format      enum
  filters     jsonb
  file_url    varchar
  status      enum

Enum values: user role is researcher | annotator | admin; task type is coding | reasoning | qa; export format is jsonl | parquet | csv.
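
The enum values above could be mirrored in application code roughly as follows. This is a plain-Python sketch rather than the project's ORM models; the class names are hypothetical, and the `"draft"` status is a placeholder because the status enum values are not listed in this document.

```python
from dataclasses import dataclass, field
from enum import Enum
from uuid import UUID, uuid4


class UserRole(str, Enum):
    RESEARCHER = "researcher"
    ANNOTATOR = "annotator"
    ADMIN = "admin"


class TaskType(str, Enum):
    CODING = "coding"
    REASONING = "reasoning"
    QA = "qa"


class ExportFormat(str, Enum):
    JSONL = "jsonl"
    PARQUET = "parquet"
    CSV = "csv"


@dataclass
class Task:
    """In-memory counterpart of a row in the tasks table."""
    creator_id: UUID
    type: TaskType
    prompt: str
    status: str = "draft"  # placeholder: status values are not documented
    metadata: dict = field(default_factory=dict)  # maps to the jsonb column
    id: UUID = field(default_factory=uuid4)
```

Constructing an enum from an unknown string, such as `TaskType("essay")`, raises `ValueError`, which rejects invalid values before they reach the database.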
04

FastAPI Route Map

POST  /api/v1/auth/login            JWT auth, rate limited 10/min
POST  /api/v1/auth/register         Register user, rate limited 5/min
GET   /api/v1/tasks                 List tasks with filters
POST  /api/v1/tasks                 Create new task
GET   /api/v1/tasks/{id}            Get task detail
POST  /api/v1/tasks/{id}/generate   Generate LLM response via Groq
PUT   /api/v1/tasks/{id}            Update task
POST  /api/v1/tasks/{id}/assign     Assign to annotator
GET   /api/v1/annotator/queue       Annotator's task queue
POST  /api/v1/feedback              Submit feedback + reward
GET   /api/v1/feedback/{task_id}    Feedback for a task
GET   /api/v1/metrics/dashboard     Aggregated metrics
GET   /api/v1/metrics/quality       Quality scores
POST  /api/v1/exports               Stream JSONL export directly to browser
GET   /health                       Health check
05

Recommended Build Phases

Phase 1 ✅ Done
Docker Compose setup, Postgres + Redis, FastAPI skeleton, Auth/JWT, DB migrations (Alembic)
Phase 2 ✅ Done
Task CRUD APIs, Feedback API, Celery workers, Assignment queue
Phase 3 ✅ Done
React + TypeScript frontend, Task Creator UI, Annotator workspace, API integration
Phase 4 ✅ Done
Quality scoring, Export pipeline, Grafana setup, Prometheus metrics
Phase 5 ✅ Deployed
Render deployment, GitHub Actions CI/CD, Upstash Redis (TLS), Rate limiting, Grafana Cloud
06

Tech Stack Decisions

FastAPI + Pydantic v2

Auto-generated OpenAPI docs, async endpoints, strong request/response validation. Use SQLAlchemy 2.0 async ORM + Alembic for migrations.

backend
⚛️

React + TypeScript + Vite

Use React Query (TanStack) for server state, Zustand for UI state, React Hook Form for task creation forms. Tailwind CSS for styling.

frontend
🐘

PostgreSQL + Redis

Postgres for all relational data with JSONB for task metadata. Redis for Celery task queues and caching hot metrics queries.

data
🌿

Celery Workers

Async workers for: quality score computation, dataset export generation, annotator assignment, and notification dispatch.

workers
📉

Grafana + Prometheus

Instrument FastAPI with prometheus-fastapi-instrumentator. Build dashboards for reward distributions, task throughput, and annotator stats.

observability
🚀

Docker + Render

Docker Compose for local dev. Deployed to Render (API + Worker + Frontend). GitHub Actions auto-deploys on push to main. Upstash Redis with TLS serves as the production queue.

infra
07

Security & Production Deployment

🔐

Auth & Access Control

JWT tokens with role-based access. Researchers and annotators have separate endpoints enforced at the API level. bcrypt + SHA-256 password hashing.

implemented
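
The role check behind role-based access can be pictured as a lookup from role to permitted actions. The role names come from the schema; the permission strings and the function name are hypothetical and only illustrate the idea, not the project's actual authorization code.

```python
# Hypothetical permission sets; only the three role names come from the schema.
ROLE_PERMISSIONS = {
    "researcher": {"task:create", "task:assign", "export:create"},
    "annotator": {"queue:read", "feedback:submit"},
    "admin": {"*"},  # wildcard: every permission
}


def is_allowed(role, permission):
    """Return True if `role` grants `permission`; unknown roles grant nothing."""
    granted = ROLE_PERMISSIONS.get(role, set())
    return "*" in granted or permission in granted
```

In a JWT setup, the role would typically be read from a claim in the verified token and passed to a check like this before the endpoint handler runs.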
🚦

Rate Limiting

Login is limited to 10 requests/min per IP and registration to 5/min, which slows brute-force attempts. Implemented via slowapi.

implemented
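
To illustrate what a per-IP limit such as 10 requests/min means, here is a small sliding-window limiter in plain Python. It is a conceptual sketch, not slowapi's implementation or API; the class name and the injectable `clock` parameter are assumptions made for illustration and testing.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within the last `window_seconds`."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # key (e.g. client IP) -> hit timestamps

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit; the caller would respond with HTTP 429
        hits.append(now)
        return True
```

A login limiter matching the figures above would be `SlidingWindowLimiter(limit=10)`, keyed by the client IP. This in-process version does not share state across multiple API instances.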
🌐

CORS

Locked to frontend origin only. No wildcard origins. Configured per-environment via ALLOWED_ORIGINS env var.

implemented
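
A sketch of how an ALLOWED_ORIGINS value might be parsed and checked. The comma-separated format, the trailing-slash normalization, and both function names are assumptions; only the variable name and the no-wildcard rule come from this document.

```python
import os


def load_allowed_origins(env=os.environ):
    """Parse a comma-separated ALLOWED_ORIGINS value into a list of origins."""
    raw = env.get("ALLOWED_ORIGINS", "")
    origins = [o.strip().rstrip("/") for o in raw.split(",") if o.strip()]
    if "*" in origins:
        # Enforce the no-wildcard policy at startup rather than at request time.
        raise ValueError("wildcard origins are not permitted")
    return origins


def is_origin_allowed(origin, allowed):
    """Exact-match check of a request's Origin header against the allow list."""
    return origin.rstrip("/") in allowed
```

The resulting list is the kind of value that would be handed to the web framework's CORS middleware as its allowed-origins setting.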
🔒

Secrets Management

All secrets stored as environment variables on Render. .env and prometheus.yml excluded from git. Groq and Grafana tokens never committed.

implemented
🚀

CI/CD Pipeline

GitHub Actions triggers on push to main and deploys the API, Worker, and Frontend to Render via deploy hooks. The API and Frontend deploys run in parallel; the Worker deploy starts only after the API deploy succeeds.

implemented
📡

Observability

Prometheus scrapes /metrics every 60s. Remote write to Grafana Cloud. Dashboards for request rate, p95 latency, error rate, and endpoint breakdown.

implemented
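
For readers unfamiliar with the p95 latency panel, it reports the value below which roughly 95% of request latencies fall. The nearest-rank calculation below shows the idea on raw samples. It is an illustration only: Prometheus and Grafana typically estimate quantiles from histogram buckets rather than raw values, and the function name is hypothetical.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so integer inputs stay exact until the division.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]
```

For example, `percentile_nearest_rank(latencies_ms, 95)` over a window of request latencies gives a raw-sample p95, which can be compared against the dashboard's estimate.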
RLFLOW — ARCHITECTURE DOCUMENT v2.0 — FULLY DEPLOYED