NestAI Docs

AI MODELS

NestAI runs open-source AI models directly on your server using Ollama. You control which models are installed — and your data never leaves your infrastructure. All speeds below are for a single user on CPU. Concurrent users share CPU and will see lower speeds.
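Under the hood, every installed model is served through Ollama's local HTTP API (port 11434 by default). A minimal sketch of a chat request body, assuming Llama 3.1 8B is installed under the tag `llama3.1:8b` — substitute whatever model tag is installed on your server:

```python
import json

# Request body for Ollama's /api/generate endpoint.
# "llama3.1:8b" is an example tag; use any model installed on your server.
payload = {
    "model": "llama3.1:8b",
    "prompt": "Summarise our refund policy in two sentences.",
    "stream": False,  # return one JSON object instead of a token stream
}

# Sent from the server itself, e.g. with urllib:
#   urllib.request.urlopen("http://localhost:11434/api/generate",
#                          data=json.dumps(payload).encode())
print(json.dumps(payload, indent=2))
```

Because the API only listens locally, requests like this never leave the server.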

Models are grouped by tier. Your standard server (8 vCPU, 16GB RAM, 160GB SSD) runs most models well. Add dedicated resources from the billing page for 32B+ models.

Latest & Recommended Models (2026)

These models run well on standard and dedicated servers. Best balance of speed, quality, and reliability.

| Model | Size | Best for | Speed (single user) |
|---|---|---|---|
| Qwen 3.5 4B ⭐ | 2.8 GB | Chat, vision, multilingual | ~15-20 tok/s |
| Qwen 3.5 9B | 5.5 GB | All-round, vision, reasoning | ~8-12 tok/s |
| Llama 3.2 3B | 2.0 GB | Fast chat, Q&A, drafting | ~15-25 tok/s |
| Phi-4 Mini | 2.5 GB | Reasoning, beats most 7B | ~15-20 tok/s |
| Gemma 3 4B | 3.3 GB | Multilingual, instruction | ~12-18 tok/s |
| Llama 3.1 8B | 4.7 GB | Balanced general use | ~8-12 tok/s |
| DeepSeek R1 7B | 4.7 GB | Reasoning, math, logic | ~10-14 tok/s |
| Mistral 7B | 4.1 GB | Business, code, structured | ~10-15 tok/s |
| Qwen 2.5 Coder 7B | 4.5 GB | Coding | ~10-14 tok/s |
| Gemma 3 12B | 7.5 GB | High-quality all-round | ~6-8 tok/s |
| Mistral Nemo 12B | 7.1 GB | Long documents (128K ctx) | ~6-8 tok/s |
Qwen 3.5 4B is our top recommendation for most users — it outperforms many 7B models while being faster and smaller.

Advanced Models (14B-20B)

Higher quality outputs but slower on standard servers. Best with dedicated resources.

| Model | Size | Best for | Speed |
|---|---|---|---|
| GPT-OSS 20B 🆕 | 12 GB | OpenAI open-weight, adjustable reasoning | ~3-5 tok/s |
| Phi-4 14B | 8.9 GB | STEM, reasoning (GPT-4o-mini level) | ~4-6 tok/s |
| Phi-4 Reasoning 14B 🆕 | 9.0 GB | Math olympiad, complex logic | ~4-6 tok/s |
| DeepSeek R1 14B | 9.0 GB | Advanced math & logic | ~4-6 tok/s |
| Qwen 2.5 14B | 8.5 GB | Multilingual, analysis | ~3-5 tok/s |
| Qwen 2.5 Coder 14B | 9.0 GB | Complex coding, multi-file | ~4-6 tok/s |
These models fit in 16GB RAM but are significantly slower on shared servers (~3-6 tok/s). Add dedicated resources for a better experience.

Flagship Models (Dedicated Resources Required)

These models deliver the highest quality but require dedicated server resources (32GB+ RAM). Add dedicated resources from Dashboard → Billing.

| Model | Size | Best for | Speed on CPU |
|---|---|---|---|
| DeepSeek R1 32B | 19 GB | Near GPT-4 reasoning | ~4-6 tok/s |
| Qwen 2.5 32B | 19 GB | Multilingual + general | ~4-6 tok/s |
| Qwen 2.5 Coder 32B | 19 GB | Best coding model | ~4-6 tok/s |
| Llama 3.3 70B | 43 GB | Highest quality overall | ~2-3 tok/s |
70B models run at ~2-3 tokens/sec on CPU — usable for document analysis and batch tasks, but not ideal for real-time chat. 32B models are the sweet spot for quality vs speed.
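A quick back-of-envelope calculation shows why. Using illustrative rates from the table above (and ignoring prompt-processing time):

```python
# Approximate wall-clock time to generate a response of a given length.
# Rates are single-user CPU estimates from the table above; prompt
# processing time is not included.
def generation_time_s(output_tokens: int, tok_per_s: float) -> float:
    return output_tokens / tok_per_s

# A 400-token document summary on Llama 3.3 70B at ~2.5 tok/s:
print(round(generation_time_s(400, 2.5)))  # 160 seconds -- fine for batch jobs

# The same summary on DeepSeek R1 32B at ~5 tok/s:
print(round(generation_time_s(400, 5.0)))  # 80 seconds
```

At interactive chat lengths the gap feels even wider, which is why 32B is the practical ceiling for conversational use on CPU.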

Embeddings & RAG

| Model | Size | Purpose |
|---|---|---|
| Nomic Embed Text | 274 MB | Required for Knowledge Base (RAG) and document search |
Install Nomic Embed to enable document search, semantic retrieval, and knowledge base features.
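Retrieval works by embedding your query and your documents as vectors, then ranking documents by cosine similarity. A sketch with made-up 4-dimensional vectors — real Nomic Embed vectors have many more dimensions, but the ranking logic is the same:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings of a query and two documents.
query = [0.9, 0.1, 0.0, 0.2]
doc_refunds = [0.8, 0.2, 0.1, 0.3]  # similar topic -> high similarity
doc_recipes = [0.0, 0.9, 0.1, 0.0]  # unrelated -> low similarity

# The refund-policy document outranks the unrelated one:
print(cosine_similarity(query, doc_refunds) > cosine_similarity(query, doc_recipes))  # True
```

NestAI handles the embedding and ranking for you; this only illustrates what the Knowledge Base does with the vectors Nomic Embed produces.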

Installing Models

Go to Dashboard → Models and click Pull →. Downloads run in the background and typically take 2–15 minutes depending on model size. You can install multiple models and switch between them anytime.

Do not restart your server while a model is downloading.

Removing Models

Click Remove next to any installed model to free disk space. You can reinstall models anytime.

Disk Usage

| Setup | Disk Usage |
|---|---|
| 1-2 small models (3-4B) | ~5-8 GB |
| 2-3 models (7B) | ~10-15 GB |
| 5 models mixed | ~25 GB |
| System + base setup | ~8 GB |
Standard servers include 160GB SSD. Dedicated resource tiers include 240-960GB depending on tier.
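To sanity-check a specific combination before installing, add the model sizes from the tables above to the ~8 GB base. A small sketch (sizes copied from the tables; the model keys are just labels for this example):

```python
# Model sizes in GB, taken from the tables above.
MODEL_SIZES_GB = {
    "qwen3.5-4b": 2.8,
    "llama3.1-8b": 4.7,
    "qwen2.5-coder-7b": 4.5,
    "nomic-embed-text": 0.274,
}

BASE_SYSTEM_GB = 8.0  # "System + base setup" row from the disk-usage table

def disk_needed_gb(models: list[str]) -> float:
    """Base system plus the sum of the chosen models' download sizes."""
    return BASE_SYSTEM_GB + sum(MODEL_SIZES_GB[m] for m in models)

total = disk_needed_gb(["qwen3.5-4b", "llama3.1-8b", "nomic-embed-text"])
print(round(total, 1))  # 15.8 GB -- comfortably within a 160 GB SSD
```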