EVE Backend

A FastAPI-based backend service for chat with Retrieval-Augmented Generation (RAG). It provides authentication, collections and document ingestion, conversation/message management, streaming responses, and a hallucination detection pipeline.

For setup, running, and additional deployment details, see README.md.

Architecture

  • FastAPI: HTTP API and dependency injection
  • MongoDB/DocumentDB: Primary datastore for users, collections, documents, conversations, messages
  • Qdrant: Vector store for embeddings and retrieval
  • LLM providers: Pluggable via src/core/llm_manager.py (see the sketch after this list)
  • Docs: MkDocs Material + mkdocstrings (Google-style docstrings)
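
The provider layer is a small registry behind a common interface. The real interface lives in src/core/llm_manager.py; the sketch below only illustrates the pluggable-provider pattern, and every name in it (LLMProvider, LLMManager, register, get) is hypothetical:

from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Minimal provider interface (hypothetical; the real one is in src/core/llm_manager.py)."""

    @abstractmethod
    async def generate(self, prompt: str, **kwargs) -> str: ...

class LLMManager:
    """Registry that resolves providers by name, so backends can be swapped at runtime."""

    def __init__(self) -> None:
        self._providers: dict[str, LLMProvider] = {}

    def register(self, name: str, provider: LLMProvider) -> None:
        self._providers[name] = provider

    def get(self, name: str) -> LLMProvider:
        return self._providers[name]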

Directory structure

src/
  routers/            # FastAPI route handlers (auth, collection, document, conversation, message, tool, user, health)
  services/           # Business logic (auth, generate_answer, email, hallucination, etc.)
  database/           # ODM-style models and pagination helpers
  core/               # Vector store and LLM managers
  middlewares/        # Authentication dependencies
  schemas/            # Pydantic request/response models
  templates/          # Prompt templates and pipeline configs
  utils/              # Helpers, parsers, embeddings, rerankers
tests/                # API and domain tests
docs/                 # Site content (this page, API references)

Key workflows

  • Authentication
    • Signup, email activation, login, refresh
    • Endpoints in routers.auth and routers.forgot_password
  • Collections & Documents
    • Create Qdrant collections, upload documents, delete documents
    • Ingestion triggers parsing, chunking, embedding, and vector upsert (sketched after this list)
    • Endpoints in routers.collection and routers.document
  • Conversations & Messages
    • Create conversations, post messages, stream responses (SSE)
    • Retry message generation, update feedback/annotations
    • Endpoints in routers.conversation and routers.message
  • Hallucination Detection
    • Synchronous detection and streaming modes
    • Annotates message metadata with label, reason, and timings (also sketched after this list)
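
The ingestion step above can be pictured as a parse → chunk → embed → upsert flow. The following is only a sketch: chunk_text and embed_texts stand in for the real parsers, chunkers, and embedding helpers in src/utils/, while the Qdrant calls use the public qdrant-client API:

from uuid import uuid4

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Naive fixed-size chunker; the real chunking lives in src/utils/."""
    step = size - overlap
    return [text[i : i + size] for i in range(0, len(text), step)]

def embed_texts(texts: list[str]) -> list[list[float]]:
    """Stand-in for the real embedding helper; returns dummy 384-dim vectors."""
    return [[0.0] * 384 for _ in texts]

def ingest_document(client: QdrantClient, collection: str, text: str, doc_id: str) -> None:
    # Create the collection if missing; vector size must match the embedding model.
    if not client.collection_exists(collection):
        client.create_collection(
            collection_name=collection,
            vectors_config=VectorParams(size=384, distance=Distance.COSINE),
        )
    chunks = chunk_text(text)
    points = [
        PointStruct(id=str(uuid4()), vector=vec, payload={"doc_id": doc_id, "text": chunk})
        for chunk, vec in zip(chunks, embed_texts(chunks))
    ]
    client.upsert(collection_name=collection, points=points)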
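
The annotation step can likewise be sketched. Only the field names (label, reason, timings) come from the list above; the message shape and the detect stand-in are hypothetical:

import time

def detect(answer: str, context: str) -> tuple[str, str]:
    """Trivial stand-in detector: flags answers with no lexical overlap with the context."""
    if set(answer.lower().split()) & set(context.lower().split()):
        return "supported", "answer shares terms with the retrieved context"
    return "hallucinated", "no lexical overlap with the retrieved context"

def annotate_message(message: dict, answer: str, context: str) -> dict:
    """Attach a hallucination verdict to the message metadata (hypothetical shape)."""
    start = time.perf_counter()
    label, reason = detect(answer, context)
    message.setdefault("metadata", {})["hallucination"] = {
        "label": label,
        "reason": reason,
        "timings": {"detection_s": time.perf_counter() - start},
    }
    return message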

API surface (reference)

  • Auth: routers.auth
  • Collections: routers.collection
  • Documents: routers.document
  • Conversations: routers.conversation
  • Messages: routers.message
  • Users: routers.user
  • Health: routers.health_check

Message generation flow (high-level)

  1. Validate conversation ownership and requested collections
  2. Expand requested collections with allowed public and user-owned collections
  3. Optionally extract year range from filters for MCP usage
  4. If starting a new chat, create Conversation first; then create placeholder Message record
  5. Run the answer generation pipeline:
    • Build context (RAG decision, retrieval, reranking)
    • Generate answer from LLM and record timings and prompt metadata
  6. Update Message output (answer), documents, flags, and latencies
  7. Optionally schedule rollup/trim in the background
  8. For streaming endpoints, publish tokens and lifecycle events via the bus (sketched below)
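
Step 8 maps naturally onto FastAPI's StreamingResponse with an in-process queue as the bus. A minimal sketch, assuming an asyncio.Queue bus and hypothetical event names (token, done); the real pipeline also performs the validation and persistence steps above:

import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def event_stream(bus: asyncio.Queue):
    """Yield SSE frames until the pipeline publishes a terminal event."""
    while True:
        event = await bus.get()  # e.g. {"type": "token", "data": "..."}
        yield f"data: {json.dumps(event)}\n\n"
        if event["type"] == "done":
            break

async def fake_pipeline(bus: asyncio.Queue) -> None:
    """Stand-in for the answer generation pipeline publishing onto the bus."""
    for token in ["Hello", ", ", "world"]:
        await bus.put({"type": "token", "data": token})
    await bus.put({"type": "done", "data": None})

@app.post("/conversations/{conversation_id}/messages/stream")
async def stream_message(conversation_id: str):
    bus: asyncio.Queue = asyncio.Queue()
    # The real endpoint validates ownership and creates the placeholder Message
    # before generation starts; here the pipeline just runs in the background.
    asyncio.create_task(fake_pipeline(bus))
    return StreamingResponse(event_stream(bus), media_type="text/event-stream")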

Documentation notes

  • Docstrings are Google-style with Args:, Returns:, Raises:.
  • Module reference pages in docs/ use:
::: routers.<module>
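
For example, a Google-style docstring that mkdocstrings renders into a reference page (the function itself is illustrative):

def create_collection(name: str, owner_id: str) -> dict:
    """Create a Qdrant collection owned by a user.

    Args:
        name: Human-readable collection name.
        owner_id: ID of the user who will own the collection.

    Returns:
        The created collection record.

    Raises:
        ValueError: If a collection with the same name already exists.
    """
    ...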