EVE Backend
A FastAPI-based backend service for chat with Retrieval-Augmented Generation (RAG). It provides authentication, collections and document ingestion, conversation/message management, streaming responses, and a hallucination detection pipeline.
For setup and running the backend, see:
- Local development & configuration — prerequisites, environment variables, and running the backend directly on your machine
- Docker setup — run the backend with Docker Compose
For additional deployment details, see README.md.
Architecture
- FastAPI: HTTP API and dependency injection
- MongoDB/DocumentDB: Primary datastore for users, collections, documents, conversations, messages
- Qdrant: Vector store for embeddings and retrieval
- LLM providers: pluggable via `src/core/llm_manager.py`
- Docs: MkDocs Material + mkdocstrings (Google-style docstrings)
Directory structure
```
src/
  routers/      # FastAPI route handlers (auth, collection, document, conversation, message, tool, user, health)
  services/     # Business logic (auth, generate_answer, email, hallucination, etc.)
  database/     # ODM-style models and pagination helpers
  core/         # Vector store and LLM managers
  middlewares/  # Authentication dependencies
  schemas/      # Pydantic request/response models
  templates/    # Prompt templates and pipeline configs
  utils/        # Helpers, parsers, embeddings, rerankers
tests/          # API and domain tests
docs/           # Site content (this page, API references)
```
Key workflows
- Authentication
- Signup, email activation, login, refresh
- Endpoints in `routers.auth` and `routers.forgot_password`
- Collections & Documents
- Create Qdrant collections, upload documents, delete documents
- Ingestion triggers parsing, chunking, embedding, and vector upsert
- Endpoints in `routers.collection` and `routers.document`
- Conversations & Messages
- Create conversations, post messages, stream responses (SSE)
- Retry message generation, update feedback/annotations
- Endpoints in `routers.conversation` and `routers.message`
- Hallucination Detection
- Synchronous detection and streaming modes
- Annotates message metadata with label, reason, timings
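As a sketch, client code might read those annotations back off a message record. The field names below (`hallucination_label`, `hallucination_reason` under a `metadata` key) are illustrative assumptions, not the backend's documented schema — check the hallucination service for the exact keys.

```python
def summarize_hallucination(message: dict) -> str:
    """Render a short label/reason summary from a message's metadata.

    Assumes hypothetical keys `hallucination_label` and
    `hallucination_reason`; adjust to the real schema.
    """
    meta = message.get("metadata", {})
    label = meta.get("hallucination_label", "unknown")
    reason = meta.get("hallucination_reason", "")
    # Drop the trailing ": " when no reason was recorded
    return f"{label}: {reason}".rstrip(": ")
```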
API guides (usage-first)
- Auth (working examples): [routers-auth]
- Collections (public/private + examples): [routers-collection]
- Documents (ingestion + examples): [routers-document]
- Conversations (chat lifecycle + examples): [routers-conversation]
- Messages (generation/streaming + examples): [routers-message]
Use [swagger-api] only when you need an exhaustive field-level reference.
Shared API setup
Use this once and reuse it across all API examples:
```python
import requests

BASE_URL = "http://localhost:8000"

# Auth flow:
# 1) signup -> verify -> login
# 2) set ACCESS_TOKEN from login response
ACCESS_TOKEN = "<your_access_token>"
headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
```
Recommended API call order
- Auth first: `POST /signup` -> `POST /verify` -> `POST /login` to get `access_token`/`refresh_token`
- Collection discovery: `GET /collections/public` to get valid `public_collections` names for generation
- Conversation lifecycle: `POST /conversations` to get `conversation_id`
- Message generation: `POST /conversations/{conversation_id}/messages` or `/stream_messages`
- Optional ingestion for private retrieval: `POST /collections` to get a private `collection_id`, then `POST /collections/{collection_id}/documents`
- Optional advanced/ops routes: `.../retry`, `.../hallucination`, `/generate`, `/retrieve`, stats endpoints
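The call order above can be sketched end to end with `requests`. The request payloads and response field names (`access_token`, `conversation_id`, the message body shape) are assumptions inferred from this page's flow, not a verified contract — confirm them against the router reference pages.

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed local dev address


def message_url(base_url: str, conversation_id: str) -> str:
    """Build the message-generation URL for a conversation."""
    return f"{base_url}/conversations/{conversation_id}/messages"


def run_chat_flow(email: str, password: str, question: str) -> dict:
    """Walk the recommended call order; payload/response shapes are assumptions."""
    s = requests.Session()
    # 1) Auth first: login (after signup/verify) and keep the bearer token
    login = s.post(f"{BASE_URL}/login", json={"email": email, "password": password})
    login.raise_for_status()
    s.headers["Authorization"] = f"Bearer {login.json()['access_token']}"
    # 2) Collection discovery: valid public collection names for generation
    public_collections = s.get(f"{BASE_URL}/collections/public").json()
    # 3) Conversation lifecycle: create a conversation to get its id
    conversation_id = s.post(f"{BASE_URL}/conversations", json={}).json()["conversation_id"]
    # 4) Message generation against the discovered collections
    resp = s.post(
        message_url(BASE_URL, conversation_id),
        json={"question": question, "public_collections": public_collections},
    )
    resp.raise_for_status()
    return resp.json()
```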
API dependency notes
- Message endpoints require a valid `conversation_id`.
- Generation endpoints require valid collection names (`public_collections`) from collection listing.
- Document endpoints require a valid private `collection_id`.
- Most non-auth endpoints require `Authorization: Bearer <access_token>`.
Message generation flow (high-level)
- Validate conversation ownership and requested collections
- Expand requested collections with allowed public and user-owned collections
- Optionally extract year range from filters for MCP usage
- If starting a new chat, create the `Conversation` first; then create a placeholder `Message` record
- Run the answer generation pipeline:
- Build context (RAG decision, retrieval, reranking)
- Generate answer from LLM and record timings and prompt metadata
- Update the `Message` output (answer), documents, flags, and latencies
- Optionally schedule rollup/trim in the background
- For streaming endpoints, publish tokens and lifecycle events via bus
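On the client side, consuming the streamed events might look like the sketch below. It assumes standard SSE framing (`data: {json}` lines) on a `/stream_messages` endpoint; the exact event shape is not specified on this page, so check the router before relying on it.

```python
import json
import requests


def parse_sse_line(line: str):
    """Return the JSON payload of a `data:` frame, or None for other lines."""
    if line.startswith("data: "):
        return json.loads(line[len("data: "):])
    return None


def stream_answer(base_url: str, token: str, conversation_id: str, question: str):
    """Yield decoded events from the streaming endpoint.

    The request payload and `data: {json}` framing are assumptions based
    on common SSE usage, not the backend's documented contract.
    """
    url = f"{base_url}/conversations/{conversation_id}/stream_messages"
    headers = {"Authorization": f"Bearer {token}", "Accept": "text/event-stream"}
    with requests.post(url, headers=headers, json={"question": question}, stream=True) as resp:
        resp.raise_for_status()
        for raw in resp.iter_lines():
            if raw:
                event = parse_sse_line(raw.decode("utf-8"))
                if event is not None:
                    yield event
```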
Documentation notes
- Docstrings are Google-style with `Args:`, `Returns:`, `Raises:`.
- Module reference pages in `docs/` use `::: routers.<module>`.
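A minimal example of the Google-style convention mkdocstrings expects (the function itself is hypothetical, not part of the codebase):

```python
def get_document(collection_id: str, document_id: str) -> dict:
    """Fetch a single document record from a private collection.

    Args:
        collection_id: Identifier of the private collection.
        document_id: Identifier of the document within the collection.

    Returns:
        The document record as a dictionary.

    Raises:
        KeyError: If the document does not exist.
    """
    ...  # hypothetical body; shown only for the docstring layout
```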