NestAI Docs

AI MODELS

NestAI runs AI models locally on your server via Ollama. You choose which models are installed — and your data never touches any external API.

Models available at setup

When you first deploy, you pick one of the models below: four that run on the standard server, plus the optional Llama 3.1 70B upgrade. Your server is sized automatically to match your choice.

| Model | Size | Best for | Server |
|---|---|---|---|
| Mistral 7B | 4.1 GB | Speed, chat, drafting, support | cx43 (8GB RAM) |
| Llama 3.2 3B | 2.0 GB | Writing, Q&A, summarisation | cx43 (8GB RAM) |
| DeepSeek R1 7B | 4.7 GB | Code, reasoning, math, logic | cx43 (8GB RAM) |
| Phi-3 Mini | 2.3 GB | Speed, lightweight, edge tasks | cx43 (8GB RAM) |
| Llama 3.1 70B ⚡ | 40 GB | Near GPT-4 quality, ultra fast | ccx33 (32GB RAM), +₹1,000/mo |

The Llama 3.1 70B (Ultra Fast) option requires a 32GB RAM dedicated server and costs ₹1,000/month extra on top of your plan price. It delivers near-instant responses for complex tasks.

Full model library (install anytime)

After deploying, go to Dashboard → Models to install additional models from this library.

| Model | Size | Best for | Notes |
|---|---|---|---|
| Phi-3 Mini | 2.3 GB | Fast general tasks | Fastest on standard server; good starting point |
| Llama 3.1 8B | 4.7 GB | General use | Best balance of quality + speed |
| Llama 3.2 3B | 2.0 GB | Fast + multilingual | Meta's compact model |
| Mistral 7B | 4.1 GB | Code + structured output | European model. Strong at JSON/code |
| Mistral Nemo 12B | 7.1 GB | Code + long context | 128k context window |
| DeepSeek R1 7B | 4.7 GB | Reasoning + analysis | Chain-of-thought reasoning built in |
| DeepSeek R1 14B | 9.0 GB | Deep analysis | Near GPT-4 quality for analysis tasks |
| Qwen 2.5 7B | 4.7 GB | Multilingual | Excellent Chinese + Asian language support |
| Code Llama 7B | 3.8 GB | Software development | Code completion, review, debugging |
| Gemma 2 9B | 5.5 GB | Fast + balanced | Google model with very fast inference |
| Nomic Embed | 274 MB | Knowledge base / RAG | Required for document search features |

Your standard server has 8GB effective RAM (4GB + 4GB swap). Models up to 7B run comfortably. Models 9–14B are slower but work. The 70B Ultra Fast plan uses a dedicated 32GB server.
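
The sizing note above can be written down as a small lookup. This is only a sketch of that rule of thumb: the function name is hypothetical, the thresholds are copied from the note, and treating anything at 70B or above as needing the dedicated server is an assumption. Sizes the note does not mention (for example 8B) are reported as not covered rather than guessed.

```python
def standard_server_fit(params_billion: float) -> str:
    """Classify a model by parameter count using the sizing note's rule of thumb.

    Hypothetical helper, not part of NestAI. Thresholds come from the note:
    up to 7B runs comfortably, 9-14B is slower but works.
    """
    if params_billion <= 7:
        return "comfortable"
    if 9 <= params_billion <= 14:
        return "slower but works"
    if params_billion >= 70:
        # Assumption: anything this large needs the 32GB Ultra Fast server.
        return "needs the dedicated 32GB server"
    # The note says nothing about sizes between these ranges.
    return "not covered by the guide"


for name, size in [("Phi-3 Mini", 3.8), ("Mistral Nemo 12B", 12), ("Llama 3.1 70B", 70)]:
    print(f"{name}: {standard_server_fit(size)}")
```

The 3.8 figure for Phi-3 Mini is its published parameter count in billions and does not appear in the tables above.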

Installing a model

Go to Dashboard → Models. Browse the library and click Pull →. The download runs in the background and takes 2–10 minutes depending on model size. Refresh the page to see the model in your installed list.

Do not power off or restart your server while a model is downloading. The pull will fail and you'll need to retry.
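
Because the models run through Ollama, a pull can in principle also be requested through Ollama's own REST API. The sketch below assumes you have shell access to the server and that Ollama listens on its default address, `http://localhost:11434`. NestAI's documentation does not confirm either point, so treat the URL as an assumption. The tag `mistral:7b` is an Ollama library tag, and whether the dashboard uses the same tag is also an assumption. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; NestAI may configure it differently.
OLLAMA_URL = "http://localhost:11434"


def build_pull_request(model_tag: str, base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /api/pull request for one model tag."""
    # "stream": False asks Ollama for a single final JSON object
    # instead of a stream of progress updates.
    body = json.dumps({"model": model_tag, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/pull",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pull_model(model_tag: str, timeout_s: float = 1800.0) -> str:
    """Send the pull request and return the final status string from Ollama.

    Blocks until the download finishes, so the timeout is generous.
    """
    request = build_pull_request(model_tag)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8")).get("status", "")
```

Calling `pull_model("mistral:7b")` on the server would start the download and block until it completes. Ollama's API documentation describes a final status of `success` for a completed pull. The warning above applies here too: a restart during the download interrupts it.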

Removing a model

In the installed models list, click ✕ Remove next to the model. This frees up disk space. Conversations that used the model are not deleted, but they won't get new responses until you reinstall it.
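
To confirm that a removal took effect, one option is to read the installed list from Ollama itself. Ollama's `GET /api/tags` endpoint returns a JSON object with a `models` array, where each entry has a `name` and a `size` in bytes. The sketch below only parses such a response. The sample payload is illustrative, not real output from a NestAI server, and the function name is hypothetical.

```python
import json

# Illustrative sample shaped like an Ollama /api/tags response.
# The byte sizes are made up to match the table's rounded figures.
SAMPLE_TAGS_JSON = """
{
  "models": [
    {"name": "mistral:7b", "size": 4100000000},
    {"name": "phi3:mini", "size": 2300000000}
  ]
}
"""


def installed_models(tags_json: str) -> dict[str, float]:
    """Map each installed model name to its size in GB (decimal, one place)."""
    payload = json.loads(tags_json)
    return {
        entry["name"]: round(entry["size"] / 1e9, 1)
        for entry in payload.get("models", [])
    }


for name, size_gb in installed_models(SAMPLE_TAGS_JSON).items():
    print(f"{name}: {size_gb} GB")
```

On a live server the JSON text would come from `http://localhost:11434/api/tags`, assuming Ollama is reachable at its default address. A removed model should no longer appear in the result.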

Disk space

Your server has a 40–160GB SSD depending on server type. Docker, system files, and Open WebUI use ~8GB. Rough guide for standard servers:

| Models installed | Space used | Remaining |
|---|---|---|
| Phi-3 + Mistral | ~7 GB | ~25 GB free |
| 3 models (7B each) | ~14 GB | ~18 GB free |
| 5 mixed models | ~25 GB | ~7 GB free |

Install Nomic Embed Text (274MB) if you plan to use the Knowledge Base. It is required for document search.
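
The figures in the guide follow from simple subtraction: disk size, minus the ~8GB system overhead, minus the installed models. The sketch below reproduces that arithmetic using sizes from the model library table. It assumes a 40GB disk, the low end of the stated range, which is consistent with the guide's rows (space used plus remaining comes to about 32GB each time). The function and constant names are hypothetical.

```python
# Assumption: the standard server has the smallest SSD in the 40–160GB range.
DISK_GB = 40.0
# From the text: Docker, system files, and Open WebUI use about 8GB.
SYSTEM_OVERHEAD_GB = 8.0

# Download sizes in GB, copied from the model library table.
MODEL_SIZES_GB = {
    "Phi-3 Mini": 2.3,
    "Llama 3.1 8B": 4.7,
    "Llama 3.2 3B": 2.0,
    "Mistral 7B": 4.1,
    "Mistral Nemo 12B": 7.1,
    "DeepSeek R1 7B": 4.7,
    "DeepSeek R1 14B": 9.0,
    "Qwen 2.5 7B": 4.7,
    "Code Llama 7B": 3.8,
    "Gemma 2 9B": 5.5,
    "Nomic Embed": 0.274,
}


def remaining_disk_gb(models: list[str], disk_gb: float = DISK_GB) -> float:
    """Estimate free disk space in GB after installing the given models."""
    used = sum(MODEL_SIZES_GB[name] for name in models)
    return round(disk_gb - SYSTEM_OVERHEAD_GB - used, 1)


# Phi-3 + Mistral: 40 - 8 - (2.3 + 4.1) = 25.6, in line with "~25 GB free".
print(remaining_disk_gb(["Phi-3 Mini", "Mistral 7B"]))
```

A negative result means the selection would not fit on the assumed disk. Passing a larger `disk_gb` models the bigger server types.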