🏛️ Project Architecture: SurfAI
Last Updated: September 13, 2025
This document details the overall system architecture of the SurfAI service, the role of each component, and the main data flows.
1. Architecture Goals and Principles
- Decoupled Responsibilities: Frontend, backend, computation server, documentation, etc., are managed in independent repositories, aiming for clear separation of responsibilities.
- Serverless First: Where possible, we use serverless platforms (`Google Cloud Run`) that do not require server management, building a cost-effective infrastructure that automatically scales up and down with traffic.
- Containerized Standardization: Both frontend and backend are packaged as `Docker` containers, ensuring consistency between development and production environments and maximizing deployment flexibility.
- Security: All communication is encrypted with `HTTPS`, `Cloudflare` provides primary security (`WAF`, `DDoS` protection), and the backend applies multi-layered security including `JWT`, `CSRF` protection, and Role-Based Access Control (RBAC).
2. Overall System Diagram
3. Detailed Role of Each Component
A. Frontend - comfy-surfai-frontend-next
- Platform: `Google Cloud Run` (Docker container)
- Domain: `surfai.org`
- Technology: `Next.js` (App Router), `TypeScript`, `Tailwind CSS`, `shadcn/ui`
- Core Role:
  - Renders all UI (`React` components) displayed to the user.
  - Globally manages user login status via `AuthContext`, operating on tokens stored in `HttpOnly` cookies.
  - Handles all backend API requests centrally via `lib/apiClient.ts`, including automatic reissuance logic when Access Tokens expire.
  - Connects to the backend's `WebSocket` via `hooks/useComfyWebSocket.ts` to receive generation progress, results, etc., in real time and reflect them in the UI (see the sketch below).
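A minimal sketch of what such a hook might look like is shown below. It is illustrative only: it assumes the backend exposes a plain WebSocket endpoint and emits JSON messages with an `event` field using the names from the data flow section (`progress`, `generation_result`); the payload shapes and the actual structure of `hooks/useComfyWebSocket.ts` may differ.

```tsx
// Hypothetical sketch of hooks/useComfyWebSocket.ts (illustrative, not the actual code).
import { useEffect, useState } from "react";

interface GenerationEvent {
  event: "progress" | "executed" | "generation_result";
  data: Record<string, unknown>;
}

export function useComfyWebSocket(url: string) {
  const [progress, setProgress] = useState<number | null>(null);
  const [results, setResults] = useState<GenerationEvent["data"][]>([]);

  useEffect(() => {
    const socket = new WebSocket(url);

    socket.onmessage = (msg) => {
      const parsed: GenerationEvent = JSON.parse(msg.data);
      if (parsed.event === "progress") {
        // Assumed payload shape: { value, max } — adjust to the real gateway contract.
        setProgress(Number(parsed.data.value) / Number(parsed.data.max));
      } else if (parsed.event === "generation_result") {
        // Collect final results so the UI (e.g. a session gallery) can render them.
        setResults((prev) => [...prev, parsed.data]);
      }
    };

    return () => socket.close();
  }, [url]);

  return { progress, results };
}
```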
B. Backend - comfy-surfai-backend
- Platform: `Google Cloud Run` (Docker container)
- Domain: `api.surfai.org`
- Technology: `NestJS`, `TypeORM`, `PostgreSQL`, `Passport.js`
- Core Role:
  - Acts as a stateless API server that handles all business logic and serves as an API gateway to other internal services.
  - Authentication: Processes `Google Sign-In` and general login requests, generates `JWT`s (Access/Refresh Token) for authenticated users, and sets them as `HttpOnly` cookies on the client. Controls access to each API endpoint via `JwtAuthGuard` and `RolesGuard` (see the sketch below).
  - Coin Management: Manages user coin balances and records coin transactions. Provides manual coin adjustment via admin APIs.
  - Generation Pipeline: Forwards generation requests from the frontend to the `ComfyUI` computation server and broadcasts progress to the frontend via `WebSocket`.
  - LLM Feature Integration: Forwards LLM-related requests (general chat, RAG chat, etc.) from the frontend to the internal `comfy-langchain` server and returns the results.
  - File Management: Securely uploads result files generated by `ComfyUI` and user-uploaded PDF files to `Cloudflare R2` and manages them.
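As a rough illustration of the guard-based access control described above, the sketch below shows how an admin-only coin endpoint might be declared in NestJS. The controller name, route, DTO shape, and the `Roles` decorator/import paths are assumptions for the example, not taken from the actual codebase.

```ts
// Hypothetical sketch of a guard-protected admin endpoint (names are illustrative).
import { Controller, Post, Body, UseGuards } from '@nestjs/common';
import { JwtAuthGuard } from './auth/jwt-auth.guard';   // assumed location
import { RolesGuard } from './auth/roles.guard';        // assumed location
import { Roles } from './auth/roles.decorator';         // assumed custom decorator

@Controller('coin')
export class CoinController {
  // Only authenticated admins may adjust balances manually.
  @Post('adjust')
  @UseGuards(JwtAuthGuard, RolesGuard)
  @Roles('admin')
  adjust(@Body() dto: { userId: number; amount: number; reason: string }) {
    // In the real service this would delegate to a coin service that
    // updates the balance and records the transaction.
    return { userId: dto.userId, adjustedBy: dto.amount };
  }
}
```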
C. Computation Server
- Platform: Local PC or cloud GPU virtual machine (on-demand/Spot VM)
- Technology: `ComfyUI`
- Core Role:
  - Dedicated to performing heavy AI computations, receiving workflows and parameters from the backend.
  - Sends `progress`, `executed`, and other events occurring during generation to the backend via `WebSocket` (see the sketch below).
  - Securely exposed to the external internet through an Nginx reverse proxy, which performs primary access control via Basic Authentication.
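The sketch below illustrates how the backend might queue a workflow on this server and subscribe to its events. It follows ComfyUI's commonly documented HTTP/WebSocket API (`POST /prompt`, WebSocket at `/ws?clientId=...`), but the base URL, credentials, and event handling here are placeholders and should be checked against the ComfyUI version actually deployed.

```ts
// Hypothetical sketch: forwarding a workflow to ComfyUI behind the Nginx reverse proxy.
import WebSocket from 'ws';

const COMFY_URL = 'https://comfy.example.com';  // placeholder address behind Nginx
const AUTH = 'Basic ' + Buffer.from('user:password').toString('base64'); // Basic Auth

export async function queueWorkflow(workflow: object, clientId: string) {
  // Queue the workflow for execution.
  const res = await fetch(`${COMFY_URL}/prompt`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: AUTH },
    body: JSON.stringify({ prompt: workflow, client_id: clientId }),
  });
  if (!res.ok) throw new Error(`ComfyUI rejected the workflow: ${res.status}`);

  // Listen for progress/executed events tied to this client id.
  const ws = new WebSocket(`${COMFY_URL.replace('https', 'wss')}/ws?clientId=${clientId}`, {
    headers: { Authorization: AUTH },
  });
  ws.on('message', (raw) => {
    const event = JSON.parse(raw.toString());
    // e.g. { type: 'progress', data: { value, max } } or { type: 'executed', data: {...} }
    console.log('ComfyUI event:', event.type);
  });
  return ws;
}
```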
D. LLM Server - comfy-langchain
- Platform: `Google Cloud Run` (Docker container)
- Technology: `FastAPI`, `Python`, `LangChain`
- Core Role:
  - A Python-based API server that specializes in handling LLM (Large Language Model) features using the `LangChain` library.
  - General Chat: Receives internal API requests from the NestJS backend, performs tasks such as text generation or summarization, and returns the results.
  - RAG Pipeline: On request from the backend, it reads a PDF, splits it into text chunks, creates vector embeddings, and stores them in the `pgvector` database. For subsequent chat requests, it retrieves relevant chunks from `pgvector` and provides them to the LLM to generate a context-aware answer.
  - Maintains security by only allowing requests from the NestJS backend, authenticated with an internal API key (`X-Internal-API-Key`); see the sketch below.
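From the backend's side, a call to this internal server might look like the sketch below. The route (`/rag/chat`), payload shape, environment variable names, and response shape are assumptions for illustration; only the `X-Internal-API-Key` header is taken from the description above.

```ts
// Hypothetical sketch of the NestJS backend calling the internal comfy-langchain API.
export async function ragChat(question: string, documentId: string) {
  const res = await fetch(`${process.env.LANGCHAIN_SERVER_URL}/rag/chat`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // Internal API key shared only between the NestJS backend and comfy-langchain.
      'X-Internal-API-Key': process.env.INTERNAL_API_KEY ?? '',
    },
    body: JSON.stringify({ question, documentId }),
  });
  if (!res.ok) throw new Error(`comfy-langchain error: ${res.status}`);
  return res.json(); // assumed shape, e.g. { answer: string, sources: string[] }
}
```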
E. Cloud Infrastructure
- Google Cloud Run: Runs the frontend and backend `Docker` containers, providing a serverless environment that automatically scales up and down with traffic.
- PostgreSQL (via Supabase): Permanently stores structured data such as users, workflows, generation history, and coin transactions. The `pgvector` extension is enabled to store text embeddings and perform similarity searches for the RAG feature.
- Cloudflare R2: Object storage for generated image/video files and user-uploaded PDF files for RAG.
- Cloudflare (overall): Manages `DNS` for the `surfai.org` domain and provides security and performance optimization features such as `WAF` and `CDN`.
F. Documentation System - surfai-docs
- Platform: `Vercel`
- Domain: `docs.surfai.org`
- Technology: `Docusaurus`, `React`, `Markdown` (MDX)
- Core Role:
  - Serves as the single source of truth, providing all technical documentation, architecture, decision logs, etc., as a static website.
  - All documents are written as `Markdown` files and version-controlled on `GitHub`.
  - A CI/CD pipeline is established through `Git` integration with `Vercel`, automatically building and deploying the site whenever changes are pushed to the `main` branch.
  - Provides multilingual documentation (Korean, English, etc.) through i18n features.
4. Key Data Flow
A. User Authentication Flow (HttpOnly Cookie + JWT)
- Login Attempt: The frontend calls the `/api/auth/google` or `/api/auth/login` API.
- Authentication and Token Issuance: After verifying identity, the backend generates an Access Token (15 min) and a Refresh Token (2 days).
- Cookie Setting: The backend sets the issued tokens in the browser as `HttpOnly`, `Secure`, `SameSite=None` (production environment) cookies via the `Set-Cookie` header of the response.
- API Request: Subsequently, the frontend's `apiClient` automatically includes the cookies with all API requests via the `credentials: 'include'` option.
- Token Validation: The backend's `JwtAuthGuard` authenticates the user by validating the `access_token` in the request cookie.
- Token Reissue: If the Access Token expires and a `401` error occurs, `apiClient` automatically calls the `/api/auth/refresh` API. The backend's `JwtRefreshGuard` validates the `refresh_token` cookie and, if successful, sets new tokens as cookies. A sketch of this retry logic follows below.
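The following is a minimal sketch of the refresh-on-401 pattern this flow implies. It is illustrative only: the real `lib/apiClient.ts` may be structured differently, and the HTTP method of the refresh endpoint is assumed here.

```ts
// Hypothetical sketch of lib/apiClient.ts's refresh-on-401 behaviour (illustrative).
const API_BASE = 'https://api.surfai.org';

export async function apiClient(path: string, init: RequestInit = {}): Promise<Response> {
  const request = () =>
    fetch(`${API_BASE}${path}`, { ...init, credentials: 'include' }); // always send cookies

  let response = await request();

  // If the access_token cookie has expired, try to refresh once and replay the request.
  if (response.status === 401) {
    const refreshed = await fetch(`${API_BASE}/api/auth/refresh`, {
      method: 'POST', // assumed method
      credentials: 'include',
    });
    if (refreshed.ok) {
      response = await request();
    }
  }
  return response;
}
```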
B. Coin Deduction and Generation Pipeline Flow
- Coin Deduction Request: When a user requests image generation from the frontend, the `POST /api/coin/deduct` API is called first to deduct coins.
- Coin Deduction Processing: The backend checks the user's coin balance; if it is sufficient, it deducts the coins and records the transaction. If the balance is insufficient, it returns an error.
- Generation Task Transfer: If coin deduction succeeds, the frontend calls the `POST /api/generate` API to hand the image generation task to the backend.
- Task Processing: The backend receives the request, validates it, and forwards the task to the `ComfyUI` computation server.
- Real-time Feedback: The computation server sends `WebSocket` events (e.g., `progress`) occurring during generation to the backend. The backend's `EventsGateway` receives these messages and broadcasts them back to the frontend (see the sketch below).
- Result Processing: Once generation is complete (`executed` message), the backend uploads the result file to `R2` and records it in the `DB`.
- Final Notification: The backend sends the final result information (DB ID, pre-signed URL for display, etc.) to the frontend as a `generation_result` WebSocket event, and the result is displayed in the `SessionGallery`.
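A minimal sketch of how such an `EventsGateway` might relay these events is shown below. It assumes the default `socket.io` adapter of `@nestjs/websockets` and illustrative payload shapes; only the event names (`progress`, `generation_result`) come from this document, and the actual gateway may differ.

```ts
// Hypothetical sketch of an EventsGateway relaying ComfyUI events to browsers.
import { WebSocketGateway, WebSocketServer } from '@nestjs/websockets';
import { Server } from 'socket.io';

@WebSocketGateway({ cors: { origin: 'https://surfai.org', credentials: true } })
export class EventsGateway {
  @WebSocketServer()
  server: Server;

  // Called when the ComfyUI listener receives a progress update.
  broadcastProgress(data: { promptId: string; value: number; max: number }) {
    this.server.emit('progress', data);
  }

  // Called after the result file is uploaded to R2 and saved in the DB.
  broadcastResult(data: { id: number; displayUrl: string }) {
    this.server.emit('generation_result', data);
  }
}
```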