
Self-host AnythingLLM on a no-KYC VPS

Chat with your PDFs, codebases, and knowledge bases via a private RAG workspace on your own server. Anonymous signup, no email required, no KYC. Pay with crypto or card, full root, no logs.

Deploy an AnythingLLM VPS from $15.59/mo · 4 GB RAM minimum

Quick start: AnythingLLM on Servury via Docker

Tested on Ubuntu 24. Pick a 4 GB+ plan, deploy, SSH in.

# 1. Install Docker
curl -fsSL https://get.docker.com | sh

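# 2. Run AnythingLLM (single-container, batteries included)
mkdir -p /opt/anythingllm/storage
touch /opt/anythingllm/storage/.env
# The image runs as a non-root user (UID/GID 1000 by default upstream),
# so the bind-mounted directory has to be writable by that user.
chown -R 1000:1000 /opt/anythingllm/storage
docker run -d --name anythingllm --restart unless-stopped \
  -p 3001:3001 \
  --cap-add SYS_ADMIN \
  -v /opt/anythingllm/storage:/app/server/storage \
  -v /opt/anythingllm/storage/.env:/app/server/.env \
  -e STORAGE_DIR="/app/server/storage" \
  mintplexlabs/anythingllm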

# 3. Open http://YOUR_SERVER_IP:3001, run setup wizard
#    Pick LLM provider (Ollama on this VPS, OpenAI, Anthropic, OpenRouter, etc.)
#    Pick embedding model and vector DB (LanceDB built-in is fine to start)

# 4. Create a workspace, upload PDFs / .docx / scrape a website / paste a YouTube link
# 5. Start chatting with citations
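
Before opening the browser in step 3, it can help to confirm the container is actually answering. The sketch below polls the server until it responds. It assumes the `/api/ping` health endpoint that upstream AnythingLLM builds expose; if your version differs, point it at the root URL instead. The function name `wait_for_anythingllm` is made up for this example.

```shell
# Poll AnythingLLM until it answers, or give up after N tries.
# Assumption: the server exposes GET /api/ping as a health endpoint.
wait_for_anythingllm() {
  url="${1:-http://127.0.0.1:3001/api/ping}"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl return non-zero on HTTP errors, --max-time bounds each attempt
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    # Only sleep if another attempt is coming
    [ "$i" -lt "$tries" ] && sleep 2
  done
  return 1
}

# Usage on the VPS:
#   wait_for_anythingllm && echo "AnythingLLM is up" || docker logs --tail 50 anythingllm
```

If the check never succeeds, the container logs are usually the quickest pointer to the cause (a storage directory the container cannot write to is a common one).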

What AnythingLLM does well

Chat with any source

PDFs, Word, Markdown, websites, YouTube transcripts, GitHub repos, Confluence, Notion, plain text. Drop in, chat with citations.

Per-workspace context

Separate workspaces with isolated context. One for legal docs, one for codebase, one for personal. Switch with a click.

Multi-user with roles

Built-in user management, RBAC, per-user usage tracking. Share workspaces or keep them private.

Pluggable everything

Pick your LLM (Ollama/OpenAI/Anthropic/Groq/OpenRouter), pick your embeddings, pick your vector DB. Hot-swap without losing data.

Citations included

Answers cite the chunks they came from, so you can verify what the model used and catch hallucinations instead of taking them on faith.

Self-host or air-gap

Pair with Ollama on the same VPS for zero-third-party AI. Documents, embeddings, model, all on your hardware.

Frequently asked questions

How much VPS do I need?

4 GB RAM is the sensible floor for the AnythingLLM container plus the embedding worker. If you also run a local LLM (Ollama with a 7B-13B model) on the same VPS, plan for 16 GB and possibly a GPU box. Splitting AnythingLLM and the LLM onto separate VPSes is also fine.
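
As a rough way to check a box against those two numbers, the sketch below reads total memory and maps it to one of three verdicts. The function names and the exact thresholds are assumptions for illustration: they sit a little under 4 GiB and 16 GiB because a "4 GB" plan typically reports slightly less usable memory to the OS.

```shell
# Map total RAM (in MiB) to the sizing guidance above.
# Thresholds are illustrative: slightly under 4 GiB / 16 GiB to allow for
# memory the kernel and hypervisor keep for themselves.
sizing_verdict() {
  mib="$1"
  if [ "$mib" -lt 3500 ]; then
    echo "below-floor"        # under the 4 GB floor
  elif [ "$mib" -lt 15000 ]; then
    echo "anythingllm-only"   # fine for AnythingLLM with a remote LLM provider
  else
    echo "local-llm-ok"       # room for Ollama with a 7B-13B model as well
  fi
}

# Read MemTotal (reported in kB) from /proc/meminfo and convert to MiB.
total_mib() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Usage on the VPS:
#   sizing_verdict "$(total_mib)"
```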

AnythingLLM vs Flowise?

AnythingLLM is a finished product with a chat UI and document workspaces. Flowise is a builder kit for arbitrary chatflows. If you want "drop docs and chat now," AnythingLLM. If you want to assemble custom logic, Flowise.

Can I expose it to my team?

Yes. Create users, assign roles, share workspaces. Put Caddy + auth in front for the extra paranoia.
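
A minimal sketch of the Caddy part, assuming Caddy is installed on the same VPS and a DNS record already points at it. The domain `llm.example.com` and the helper name `write_caddyfile` are placeholders. Caddy obtains a TLS certificate for a public domain on its own, so the config only needs the reverse proxy line. An extra authentication layer in Caddy is left out here; AnythingLLM's own multi-user login still applies.

```shell
# Write a minimal Caddyfile that terminates TLS and proxies to AnythingLLM.
# $1 = public domain (placeholder: llm.example.com)
# $2 = output path (defaults to the usual package location)
write_caddyfile() {
  domain="$1"
  out="${2:-/etc/caddy/Caddyfile}"
  cat > "$out" <<EOF
$domain {
    reverse_proxy 127.0.0.1:3001
}
EOF
}

# Usage on the VPS:
#   write_caddyfile llm.example.com && systemctl reload caddy
```

With a proxy in front, it makes sense to publish the container port on loopback only (`-p 127.0.0.1:3001:3001` in the `docker run` command) so port 3001 is not reachable directly from the internet.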

Does it work without the cloud?

Yes. Use Ollama as the LLM, the bundled LanceDB as the vector store, and the local embedding model. Once the models are downloaded, prompts, documents, and embeddings stay on the server, with no calls to third-party AI APIs.
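
One way to wire this up, sketched under a few assumptions: the AnythingLLM container is named `anythingllm`, Ollama runs from the official `ollama/ollama` image, and both share a user-defined Docker network so they can reach each other by container name. The network name `llmnet`, the function name, and the default model tag are illustrative choices, not requirements.

```shell
# Names below are illustrative; change them to taste.
OLLAMA_CONTAINER="ollama"
LLM_NETWORK="llmnet"
# Containers on the same user-defined network resolve each other by name,
# so this is the base URL to enter in AnythingLLM's LLM provider settings.
OLLAMA_BASE_URL="http://${OLLAMA_CONTAINER}:11434"

setup_local_llm() {
  model="${1:-llama3.1:8b}"   # example model tag; pick one that fits your RAM
  docker network create "$LLM_NETWORK" 2>/dev/null || true
  # No -p flag: Ollama stays reachable only from inside the Docker network.
  docker run -d --name "$OLLAMA_CONTAINER" --restart unless-stopped \
    --network "$LLM_NETWORK" \
    -v ollama:/root/.ollama \
    ollama/ollama
  # Give the Ollama server a moment to start before pulling (up to ~30 s).
  n=0
  until docker exec "$OLLAMA_CONTAINER" ollama list >/dev/null 2>&1; do
    n=$((n + 1))
    [ "$n" -ge 30 ] && return 1
    sleep 1
  done
  docker exec "$OLLAMA_CONTAINER" ollama pull "$model"
  # Attach the already-running AnythingLLM container to the same network.
  docker network connect "$LLM_NETWORK" anythingllm
}

# Usage on the VPS:
#   setup_local_llm && echo "Set the Ollama base URL to $OLLAMA_BASE_URL"
```

The model pull is the one step that needs outbound access; after that the stack can run without talking to any external AI service.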

How big a corpus can it handle?

Tens of thousands of documents per workspace is fine on a 4 GB VPS. For millions of docs, switch the vector store to Qdrant/Weaviate and bump RAM.

Can I integrate with Slack/Discord?

Yes, AnythingLLM has webhook and embed integrations, plus a public REST API to wire into anything else.
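
A hedged sketch of calling the REST API from a script, for example from a bot handler. It assumes the developer API accepts `POST /api/v1/workspace/<slug>/chat` with a bearer API key and a JSON body containing `message` and `mode`; check the API docs served by your own instance before relying on it. The function names and the `ANYTHINGLLM_API_KEY` variable are made up for this example.

```shell
# Build the JSON body. Backslashes and double quotes are escaped so the
# message is a valid JSON string; newlines and control characters are not
# handled here, so use a real JSON tool such as jq for arbitrary input.
chat_payload() {
  msg=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"message":"%s","mode":"%s"}' "$msg" "${2:-chat}"
}

# $1 = base URL, $2 = workspace slug, $3 = message.
# The API key is read from the (hypothetical) ANYTHINGLLM_API_KEY variable.
# Assumption: the chat endpoint path below matches your AnythingLLM version.
ask_workspace() {
  curl -fsS -X POST "$1/api/v1/workspace/$2/chat" \
    -H "Authorization: Bearer $ANYTHINGLLM_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$3")"
}

# Usage:
#   export ANYTHINGLLM_API_KEY=...   # generated in the AnythingLLM settings
#   ask_workspace http://127.0.0.1:3001 legal-docs "Summarize the NDA"
```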

How do I back up workspaces?

rsync the storage dir on a schedule. The container is stateless; the storage dir holds everything.
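
A minimal sketch of that backup, assuming the storage path from the quick start and an rsync-reachable destination. The function name `backup_storage` and the example host `backup.example.com` are placeholders.

```shell
# Mirror the AnythingLLM storage directory to a destination.
# $1 = source directory, $2 = destination (local path or user@host:/path)
backup_storage() {
  src="$1"
  dest="$2"
  [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
  [ -n "$dest" ] || { echo "usage: backup_storage SRC DEST" >&2; return 1; }
  # -a preserves permissions and timestamps; --delete keeps the mirror exact.
  # Trailing slashes copy the directory's contents rather than the directory itself.
  rsync -a --delete "$src"/ "$dest"/
}

# Usage:
#   backup_storage /opt/anythingllm/storage backup@backup.example.com:/srv/backups/anythingllm
#
# Nightly at 03:00 via /etc/cron.d/anythingllm-backup (one line):
#   0 3 * * * root rsync -a --delete /opt/anythingllm/storage/ backup@backup.example.com:/srv/backups/anythingllm/
```

Because `--delete` mirrors deletions too, this gives you one current copy, not a history. Copying while the container is writing can also catch a database file mid-update; stopping the container for the few seconds the sync takes (`docker stop anythingllm`, sync, `docker start anythingllm`) avoids that.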

Does Servury log my AnythingLLM traffic?

No. No application-level logging on customer servers. Anonymous signup, crypto/card payment, no logs.