
AI Models

Preinstalled AI Models

Preinstalled AI models are stored in the /mnt/data/ai-models folder. Everyone can read this folder, but only users in the ai-models group can modify it.
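To see quickly which side of that rule your account falls on, a small check like the following may help. It is a minimal sketch: the function name `can_modify_models` is hypothetical and not part of the server setup; only the default path comes from this page.

```shell
# Hypothetical helper (not part of the server setup): report whether the
# current user may modify a model folder. The default path is the one
# documented on this page.
can_modify_models() {
    dir="${1:-/mnt/data/ai-models}"
    if [ -w "$dir" ]; then
        echo "writable"
    else
        echo "read-only"
    fi
}

# Usage:
#   can_modify_models                             # checks /mnt/data/ai-models
#   can_modify_models /mnt/data/ai-models/gguf    # checks one sub-folder
#   id -nG                                        # lists your groups; look for ai-models
```

If you were added to the ai-models group recently, the new membership usually takes effect only after you log in again.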

The /mnt/data/ai-models folder currently contains three sub-folders:

  • gguf: Models in the GGUF format (typically used by llama.cpp).
  • ollama: Models downloaded from Ollama (generally derived from the GGUF format, but usable only with Ollama).
  • huggingface-snapshots: Models downloaded from Hugging Face; the format varies by repository.

The following sub-sections list the models available in each of the three sub-folders.

GGUF Models

Model Name Alias of From First release date Download date Params (B) Context size (K) Model size on disk (GB) Input Output MoE Use cases and architecture Comments
deepseek-r1:1.5b DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf DeepSeek 2025/01 2026/06 1.500 128 1.1 Text Text No Conversational LLM --
deepseek-r1:7b DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf DeepSeek 2025/01 2026/06 7.000 128 4.4 Text Text No Conversational LLM --
deepseek-r1:8b DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf DeepSeek 2025/01 2026/06 8.000 128 4.6 Text Text No Conversational LLM --
deepseek-r1:14b DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf DeepSeek 2025/01 2026/06 14.000 128 8.4 Text Text No Conversational LLM --
deepseek-r1:32b DeepSeek-R1-Distill-Qwen-32B-Q4_K_M.gguf DeepSeek 2025/01 2026/06 32.000 128 19.0 Text Text No Conversational LLM --
deepseek-r1:70b DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf DeepSeek 2025/01 2026/06 70.000 128 40.0 Text Text No Conversational LLM --
devstral-small-2:24b Devstral-Small-2-24B-Instruct-2512-Q4_K_M.gguf Mistral AI 2025/12 2026/06 24.000 256 14.0 Text, Image Text No LLM for coding --
gemma4:e2b Gemma-4-E2B-it-Q4_K_M.gguf Google DeepMind 2026/04 2026/04 5.000 128 7.2 Text, Image, Video Text No Multimodal LLM --
gemma4:e4b Gemma-4-E4B-it-Q4_K_M.gguf Google DeepMind 2026/04 2026/04 8.000 128 9.6 Text, Image, Video Text No Multimodal LLM --
gemma4:12b Gemma-4-12B-it-Q4_K_M.gguf Google DeepMind 2026/06 2026/06 12.000 256 18.0 Text, Image, Video Text No Multimodal LLM --
gemma4:26b Gemma-4-26B-A4B-it-Q4_K_M.gguf Google DeepMind 2026/04 2026/04 26.000 256 18.0 Text, Image, Video Text Yes Multimodal LLM --
gemma4:31b Gemma-4-31B-it-Q4_K_M.gguf Google DeepMind 2026/04 2026/04 31.000 256 20.0 Text, Image, Video Text No Multimodal LLM --
gpt-oss:20b GPT-OSS-20B-A4B-Q4_K_M.gguf OpenAI 2025/08 2026/06 20.000 128 11.0 Text Text Yes Conversational LLM --
gpt-oss-mxfp4:20b GPT-OSS-20B-A4B-MXFP4.gguf OpenAI 2025/08 2026/06 20.000 128 12.0 Text Text Yes Conversational LLM --
llama2-q4_0:7b Llama-2-7B-Q4_0.gguf Meta 2023/07 2025/11 7.000 4 3.6 Text Text No Conversational LLM --
llama2:7b Llama-2-7B-Q4_K_M.gguf Meta 2023/07 2026/06 7.000 4 3.9 Text Text No Conversational LLM --
llama2:13b Llama-2-13B-Q4_K_M.gguf Meta 2023/07 2026/06 13.000 4 7.4 Text Text No Conversational LLM --
llama3.1:8b Llama-3.1-8B-Instruct-Q4_K_M.gguf Meta 2024/07 2026/06 8.000 128 4.6 Text Text No Conversational LLM --
llama3.1:70b Llama-3.1-70B-Instruct-Q4_K_M.gguf Meta 2024/07 2026/06 70.000 128 40.0 Text Text No Conversational LLM --
llama3.2:1b Llama-3.2-1B-Instruct-Q4_K_M.gguf Meta 2024/09 2026/06 1.000 128 0.8 Text Text No Conversational LLM --
llama3.2:3b Llama-3.2-3B-Instruct-Q4_K_M.gguf Meta 2024/09 2026/06 3.000 128 1.9 Text Text No Conversational LLM --
ministral3:3b Ministral-3-3B-Instruct-2512-Q4_K_M.gguf Mistral AI 2025/05 2026/06 3.000 256 2.0 Text, Image Text No Multimodal LLM --
ministral3:8b Ministral-3-8B-Instruct-2512-Q4_K_M.gguf Mistral AI 2025/05 2026/06 8.000 256 4.9 Text, Image Text No Multimodal LLM --
ministral3:14b Ministral-3-14B-Instruct-2512-Q4_K_M.gguf Mistral AI 2025/05 2026/06 14.000 256 7.7 Text, Image Text No Multimodal LLM --
mistral-small3.2:24b Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M.gguf Mistral AI 2025/06 2026/06 24.000 128 14.0 Text, Image Text No Multimodal LLM --
qwen3.5:0.8b Qwen3.5-0.8B-Q4_K_M.gguf Alibaba Cloud 2026/02 2026/06 0.800 256 0.5 Text, Image Text No Multimodal LLM --
qwen3.5:2b Qwen3.5-2B-Q4_K_M.gguf Alibaba Cloud 2026/02 2026/06 2.000 256 1.2 Text, Image Text No Multimodal LLM --
qwen3.5:4b Qwen3.5-4B-Q4_K_M.gguf Alibaba Cloud 2026/02 2026/06 4.000 256 2.6 Text, Image Text No Multimodal LLM --
qwen3.5:9b Qwen3.5-9B-Q4_K_M.gguf Alibaba Cloud 2026/02 2026/06 9.000 256 5.3 Text, Image Text No Multimodal LLM --
qwen3.5:27b Qwen3.5-27B-Q4_K_M.gguf Alibaba Cloud 2026/02 2026/06 27.000 256 16.0 Text, Image Text No Multimodal LLM --
qwen3.5:35b Qwen3.5-35B-A3B-Q4_K_M.gguf Alibaba Cloud 2026/02 2026/06 35.000 256 21.0 Text, Image Text Yes Multimodal LLM --
qwen3.6:27b Qwen3.6-27B-Q4_K_M.gguf Alibaba Cloud 2026/04 2026/04 27.000 256 16.0 Text, Image Text No Multimodal LLM --
qwen3.6:35b Qwen3.6-35B-A3B-UD-Q4-K-M.gguf Alibaba Cloud 2026/04 2026/04 35.000 256 21.0 Text, Image Text Yes Multimodal LLM --
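As an illustration of how a file from the "Alias of" column might be handed to llama.cpp, here is a minimal sketch. It assumes the GGUF files sit directly under /mnt/data/ai-models/gguf (the page does not state the exact layout), and the `MODEL_DIR` variable and `gguf_path` helper are hypothetical names introduced for this example.

```shell
# Assumption: GGUF files sit directly under the gguf sub-folder.
MODEL_DIR="${MODEL_DIR:-/mnt/data/ai-models/gguf}"

# Print the full path of a GGUF file, or fail if it is not there.
gguf_path() {
    path="$MODEL_DIR/$1"
    if [ -f "$path" ]; then
        printf '%s\n' "$path"
    else
        echo "not found: $path" >&2
        return 1
    fi
}

# Usage (llama-cli is llama.cpp's command-line tool and must be on the PATH):
#   model=$(gguf_path DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf) &&
#       llama-cli -m "$model" -p "Explain GGUF in one sentence." -n 64
```

Resolving the path first means a mistyped file name fails with a clear message before the model loader starts.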

Ollama Models

Model Name Alias of From First release date Download date Params (B) Context size (K) Model size on disk (GB) Input Output MoE Use cases and architecture Comments
codellama:13b codellama:13b-instruct-q4_0 Meta 2023/08 2026/02 13.000 16 7.4 Text Text No LLM for coding --
codellama:34b codellama:34b-instruct-q4_0 Meta 2023/08 2026/02 34.000 16 19.0 Text Text No LLM for coding --
deepseek-coder-v2:16b deepseek-coder-v2:16b-lite-instruct-q4_0 DeepSeek 2024/07 2026/02 16.000 160 8.9 Text Text Yes LLM for coding --
deepseek-r1:1.5b deepseek-r1:1.5b-qwen-distill-q4_K_M DeepSeek 2025/01 2026/02 1.500 128 1.1 Text Text No Conversational LLM --
deepseek-r1:7b deepseek-r1:7b-qwen-distill-q4_K_M DeepSeek 2025/01 2026/02 7.000 128 4.7 Text Text No Conversational LLM --
deepseek-r1:8b deepseek-r1:8b-0528-qwen3-q4_K_M DeepSeek 2025/01 2026/02 8.000 128 5.2 Text Text No Conversational LLM --
deepseek-r1:14b deepseek-r1:14b-qwen-distill-q4_K_M DeepSeek 2025/01 2025/10 14.800 128 9.0 Text Text No Conversational LLM --
deepseek-r1:32b deepseek-r1:32b-qwen-distill-q4_K_M DeepSeek 2025/01 2026/02 32.000 128 20.0 Text Text No Conversational LLM --
deepseek-r1:70b deepseek-r1:70b-llama-distill-q4_K_M DeepSeek 2025/01 2026/02 70.000 128 43.0 Text Text No Conversational LLM --
devstral-small-2:24b devstral-small-2:24b-instruct-2512-q4_K_M Mistral AI 2025/12 2026/01 24.000 256 15.0 Text, Image Text No LLM for coding Incompatible with Ollama v0.9.3+IPEX-LLM
aiasistentworld/ERNIE-4.5-21B-A3B-Thinking-LLM:latest Q4_K_M Baidu 2025/06 2026/02 21.800 128 13.0 Text Text Yes Conversational LLM Incompatible with Ollama v0.9.3+IPEX-LLM
gemma3:270m gemma3:270m-it-q8_0 Google DeepMind 2025/03 2026/02 0.270 32 0.3 Text Text No Conversational LLM Requires Ollama 0.6 or later
gemma3:1b gemma3:1b-it-q4_K_M Google DeepMind 2025/03 2026/02 1.000 32 0.8 Text Text No Conversational LLM Requires Ollama 0.6 or later
gemma3:4b gemma3:4b-it-q4_K_M Google DeepMind 2025/03 2026/02 4.000 128 3.3 Text, Image Text No Conversational LLM Requires Ollama 0.6 or later
gemma3:12b gemma3:12b-it-q4_K_M Google DeepMind 2025/03 2026/02 12.000 128 8.1 Text, Image Text No Conversational LLM Requires Ollama 0.6 or later
gemma3:27b gemma3:27b-it-q4_K_M Google DeepMind 2025/03 2026/02 27.000 128 17.0 Text, Image Text No Conversational LLM Requires Ollama 0.6 or later
gemma4:e2b gemma4:e2b-it-q4_K_M Google DeepMind 2026/04 2026/04 5.000 128 7.2 Text, Image, Video Text Yes Multimodal LLM Requires Ollama 0.20.0 or later
gemma4:e4b gemma4:e4b-it-q4_K_M Google DeepMind 2026/04 2026/04 8.000 128 9.6 Text, Image, Video Text Yes Multimodal LLM Requires Ollama 0.20.0 or later
gemma4:26b gemma4:26b-a4b-it-q4_K_M Google DeepMind 2026/04 2026/04 26.000 256 18.0 Text, Image, Video Text Yes Multimodal LLM Requires Ollama 0.20.0 or later
gemma4:31b gemma4:31b-it-q4_K_M Google DeepMind 2026/04 2026/04 31.000 256 20.0 Text, Image, Video Text No Multimodal LLM Requires Ollama 0.20.0 or later
glm4:9b glm4:9b-chat-q4_0 Zhipu AI 2024/06 2025/10 9.000 128 5.5 Text Text No Conversational LLM Requires Ollama 0.2 or later
glm-4.7-flash:q4_K_M -- Zhipu AI 2026/01 2025/10 30.000 198 19.0 Text Text Yes Conversational LLM Requires Ollama 0.14.3 or later
glm-4.7-flash:q8_0 -- Zhipu AI 2026/01 2025/10 30.000 198 32.0 Text Text Yes Conversational LLM Requires Ollama 0.14.3 or later
glm-4.7-flash:bf16 -- Zhipu AI 2026/01 2025/10 30.000 198 60.0 Text Text Yes Conversational LLM Requires Ollama 0.14.3 or later
gpt-oss:20b -- OpenAI 2025/08 2025/10 20.900 128 14.0 Text Text Yes Conversational LLM Incompatible with Ollama v0.9.3+IPEX-LLM
gpt-oss:120b -- OpenAI 2025/08 2025/10 120.000 128 65.0 Text Text Yes Conversational LLM Incompatible with Ollama v0.9.3+IPEX-LLM
granite4:350m granite4:350m-bf16 IBM 2025/11 2026/02 0.350 32 0.7 Text Text No Conversational LLM --
granite4:350m-h granite4:350m-h-q8_0 IBM 2025/11 2026/02 0.350 32 0.4 Text Text Yes Conversational LLM --
granite4:1b granite4:1b-bf16 IBM 2025/11 2026/02 1.000 128 3.3 Text Text No Conversational LLM --
granite4:1b-h granite4:1b-h-q8_0 IBM 2025/11 2026/02 1.000 1000 1.6 Text Text Yes Conversational LLM --
granite4:3b granite4:micro (Q4_K_M) IBM 2025/11 2026/02 3.000 128 2.1 Text Text No Conversational LLM --
granite4:3b-h granite4:micro-h (Q4_K_M) IBM 2025/11 2026/02 3.000 1000 1.9 Text Text Yes Conversational LLM --
granite4:7b-a1b-h granite4:tiny-h (Q4_K_M) IBM 2025/11 2026/02 7.000 1000 4.2 Text Text Yes Conversational LLM --
granite4:32b-a9b-h granite4:small-h (Q4_K_M) IBM 2025/11 2026/02 32.000 1000 19.0 Text Text Yes Conversational LLM --
internlm2.5:1.8b-chat -- Shanghai AI Laboratory 2024/07 2025/02 1.800 32 3.8 Text Text No Conversational LLM --
internlm2.5:7b-chat -- Shanghai AI Laboratory 2024/07 2025/02 7.000 32 15.0 Text Text No Conversational LLM --
internlm2.5:7b-chat-1m -- Shanghai AI Laboratory 2024/07 2025/02 7.000 256 15.0 Text Text No Conversational LLM --
internlm2.5:20b-chat -- Shanghai AI Laboratory 2024/07 2025/02 20.000 32 40.0 Text Text No Conversational LLM --
internlm3-8b-instruct -- Shanghai AI Laboratory 2025/01 2025/02 8.000 32 18.0 Text Text No Conversational LLM --
llama2:7b llama2:7b-chat-q4_0 Meta 2023/07 2025/02 7.000 4 3.8 Text Text No Conversational LLM --
llama2:13b llama2:13b-chat-q4_0 Meta 2023/07 2025/02 13.000 4 7.4 Text Text No Conversational LLM --
llama2:70b llama2:70b-chat-q4_0 Meta 2023/07 2025/02 70.000 4 39.0 Text Text No Conversational LLM --
llama3.1:8b llama3.1:8b-instruct-q4_K_M Meta 2024/07 2025/02 8.000 128 4.9 Text Text No Conversational LLM --
llama3.1:70b llama3.1:70b-instruct-q4_K_M Meta 2024/07 2025/02 70.000 128 43.0 Text Text No Conversational LLM --
llama3.2:1b llama3.2:1b-instruct-q8_0 Meta 2024/09 2025/02 1.000 128 1.3 Text Text No Conversational LLM --
llama3.2:3b llama3.2:3b-instruct-q4_K_M Meta 2024/09 2025/02 3.000 128 2.0 Text Text No Conversational LLM --
llava:13b llava:13b-v1.6-vicuna-q4_0 Microsoft Research 2023/10 2026/02 13.000 4 8.0 Text, Image Text No Multimodal LLM --
llava:34b llava:34b-v1.6-q4_0 Microsoft Research 2023/10 2026/02 34.000 4 20.0 Text, Image Text No Multimodal LLM --
llava-llama3:8b llava-llama3:8b-v1.1-q4_0 Microsoft Research 2024/04 2026/02 8.000 8 5.5 Text, Image Text No Multimodal LLM --
mistral:7b mistral:7b-instruct-v0.3-q4_K_M Mistral AI 2023/09 2026/03 7.000 32 4.4 Text Text No Conversational LLM --
mistral-small3.2:24b mistral-small3.2:24b-instruct-2506-q4_K_M Mistral AI 2025/06 2026/01 24.000 128 15.0 Text, Image Text No Multimodal LLM --
mistral-nemo mistral-nemo:12b-instruct-2407-q4_0 Mistral AI 2024/07 2026/03 12.000 1000 7.1 Text Text No Conversational LLM --
mixtral:8x7b mixtral:8x7b-instruct-v0.1-q4_0 Mistral AI 2023/12 2026/01 57.000 32 26.0 Text Text Yes Conversational LLM --
mixtral:8x22b mixtral:8x22b-instruct-v0.1-q4_0 Mistral AI 2023/12 2025/10 140.600 64 80.0 Text Text Yes Conversational LLM --
nomic-embed-text-v2-moe -- Nomic AI 2025/02 2026/01 0.305 512 1.0 Text Text Yes LLM for multilingual retrieval --
olmo-3:7b olmo-3:7b-think-q4_K_M Allen AI 2025/11 2026/02 7.000 64 4.5 Text Text No Conversational LLM Incompatible with Ollama v0.9.3+IPEX-LLM
olmo-3:32b olmo-3:32b-think-q4_K_M Allen AI 2025/11 2026/02 32.000 64 19.0 Text Text No Conversational LLM Incompatible with Ollama v0.9.3+IPEX-LLM
olmo-3.1:32b olmo-3.1:32b-think-q4_K_M Allen AI 2025/12 2026/02 32.000 64 19.0 Text Text No Conversational LLM Incompatible with Ollama v0.9.3+IPEX-LLM
olmo-3.1:32b-instruct olmo-3.1:32b-instruct-q4_K_M Allen AI 2025/12 2026/02 32.000 64 19.0 Text Text No Conversational LLM Incompatible with Ollama v0.9.3+IPEX-LLM
phi4:14b phi4:14b-q4_K_M Microsoft 2025/01 2026/02 14.000 16 9.1 Text Text No Conversational LLM --
phi4-mini:3.8b phi4-mini:3.8b-q4_K_M Microsoft 2025/01 2026/02 3.800 128 2.5 Text Text No Conversational LLM --
phi4-reasoning:14b phi4-reasoning:14b-q4_K_M Microsoft 2025/04 2026/02 14.000 16 11.0 Text Text No Conversational LLM --
phi4-mini-reasoning:3.8b phi4-mini-reasoning:3.8b-q4_K_M Microsoft 2025/01 2026/02 3.800 128 3.2 Text Text No Conversational LLM --
qwen2.5:0.5b qwen2.5:0.5b-instruct-q4_K_M Alibaba Cloud 2024/09 2026/02 0.500 32 0.4 Text Text No Conversational LLM --
qwen2.5:1.5b qwen2.5:1.5b-instruct-q4_K_M Alibaba Cloud 2024/09 2026/02 1.500 32 1.0 Text Text No Conversational LLM --
qwen2.5:3b qwen2.5:3b-instruct-q4_K_M Alibaba Cloud 2024/09 2026/02 3.000 32 1.9 Text Text No Conversational LLM --
qwen2.5:7b qwen2.5:7b-instruct-q4_K_M Alibaba Cloud 2024/09 2026/02 7.000 32 4.7 Text Text No Conversational LLM --
qwen2.5:14b qwen2.5:14b-instruct-q4_K_M Alibaba Cloud 2024/09 2026/02 14.000 32 9.0 Text Text No Conversational LLM --
qwen2.5:32b qwen2.5:32b-instruct-q4_K_M Alibaba Cloud 2024/09 2026/02 32.000 32 20.0 Text Text No Conversational LLM --
qwen2.5:72b qwen2.5:72b-instruct-q4_K_M Alibaba Cloud 2024/09 2026/02 72.000 32 47.0 Text Text No Conversational LLM --
qwen2.5vl:7b qwen2.5vl:7b-q4_K_M Alibaba Cloud 2024/12 2026/02 7.000 125 6.0 Text, Image Text No Multimodal LLM --
qwen2.5vl:32b qwen2.5vl:32b-q4_K_M Alibaba Cloud 2024/12 2026/02 32.000 125 21.0 Text, Image Text No Multimodal LLM --
qwen3:0.6b qwen3:0.6b-q4_K_M Alibaba Cloud 2025/04 2025/10 0.600 40 0.5 Text Text No Conversational LLM --
qwen3:1.7b qwen3:1.7b-q4_K_M Alibaba Cloud 2025/04 2025/10 1.700 40 1.4 Text Text No Conversational LLM --
qwen3:4b qwen3:4b-q4_K_M Alibaba Cloud 2025/04 2025/10 4.000 256 2.5 Text Text No Conversational LLM --
qwen3:8b qwen3:8b-q4_K_M Alibaba Cloud 2025/04 2025/10 8.000 40 5.2 Text Text No Conversational LLM --
qwen3:14b qwen3:14b-thinking-2507-q4_K_M Alibaba Cloud 2025/04 2025/10 14.000 40 9.3 Text Text No Conversational LLM --
qwen3:30b qwen3:30b-a3b-thinking-2507-q4_K_M Alibaba Cloud 2025/04 2025/10 30.500 256 19.0 Text Text Yes Conversational LLM --
qwen3:32b qwen3:32b-q4_K_M Alibaba Cloud 2025/04 2026/02 32.000 40 20.0 Text Text No Conversational LLM --
qwen3-coder:30b qwen3-coder:30b-a3b-q4_K_M Alibaba Cloud 2025/08 2025/10 30.500 256 19.0 Text Text Yes LLM for coding --
qwen3-coder-next:latest qwen3-coder-next:q4_K_M Alibaba Cloud 2026/02 2026/03 80.000 256 52.0 Text Text Yes LLM for coding --
qwen3-vl:2b qwen3-vl:2b-thinking-q4_K_M Alibaba Cloud 2025/10 2026/02 2.000 256 1.9 Text, Image Text No Multimodal LLM Incompatible with Ollama v0.9.3+IPEX-LLM
qwen3-vl:4b qwen3-vl:4b-thinking-q4_K_M Alibaba Cloud 2025/10 2026/02 4.000 256 3.3 Text, Image Text No Multimodal LLM Incompatible with Ollama v0.9.3+IPEX-LLM
qwen3-vl:8b qwen3-vl:8b-thinking-q4_K_M Alibaba Cloud 2025/10 2026/02 8.000 256 6.1 Text, Image Text No Multimodal LLM Incompatible with Ollama v0.9.3+IPEX-LLM
qwen3-vl:30b qwen3-vl:30b-a3b-thinking-q4_K_M Alibaba Cloud 2025/10 2026/02 30.000 256 20.0 Text, Image Text Yes Multimodal LLM Incompatible with Ollama v0.9.3+IPEX-LLM
qwen3-vl:32b qwen3-vl:32b-thinking-q4_K_M Alibaba Cloud 2025/10 2026/02 32.000 256 21.0 Text, Image Text No Multimodal LLM Incompatible with Ollama v0.9.3+IPEX-LLM
qwen3.5:0.8b qwen3.5:0.8b-q8_0 Alibaba Cloud 2026/02 2026/03 0.800 256 1.0 Text, Image Text No Multimodal LLM Requires Ollama 0.17.4 or later
qwen3.5:2b qwen3.5:2b-q8_0 Alibaba Cloud 2026/02 2026/03 2.000 256 2.7 Text, Image Text No Multimodal LLM Requires Ollama 0.17.4 or later
qwen3.5:4b qwen3.5:4b-q4_K_M Alibaba Cloud 2026/02 2026/03 4.000 256 3.4 Text, Image Text No Multimodal LLM Requires Ollama 0.17.4 or later
qwen3.5:9b qwen3.5:9b-q4_K_M Alibaba Cloud 2026/02 2026/03 9.000 256 6.6 Text, Image Text No Multimodal LLM Requires Ollama 0.17.4 or later
qwen3.5:27b qwen3.5:27b-q4_K_M Alibaba Cloud 2026/02 2026/03 27.000 256 17.0 Text, Image Text No Multimodal LLM Requires Ollama 0.17.4 or later
qwen3.5:35b qwen3.5:35b-a3b-q4_K_M Alibaba Cloud 2026/02 2026/03 35.000 256 24.0 Text, Image Text Yes Multimodal LLM Requires Ollama 0.17.4 or later
qwen3.5:122b qwen3.5:122b-a10b-q4_K_M Alibaba Cloud 2026/02 2026/03 122.000 256 81.0 Text, Image Text Yes Multimodal LLM Requires Ollama 0.17.4 or later
qwen3.6:27b qwen3.6:27b-q4_K_M Alibaba Cloud 2026/04 2026/04 27.000 256 16.0 Text, Image Text No Multimodal LLM --
qwen3.6:35b qwen3.6:35b-a3b-q4_K_M Alibaba Cloud 2026/04 2026/04 35.000 256 23.0 Text, Image Text Yes Multimodal LLM --
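Several rows above carry a "Requires Ollama X or later" note. A small comparison helper can check an installed version against such a minimum. This is a sketch under assumptions: `version_at_least` is a hypothetical name, it relies on `sort -V` from GNU coreutils, and it expects plain dotted versions (it does not handle suffixed builds such as v0.9.3+IPEX-LLM).

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Usage: pass the version you have installed (see `ollama --version`)
# and the minimum from the Comments column.
#   version_at_least 0.20.0 0.17.4 && echo "meets the stated minimum"
```

Version sort is used instead of plain string comparison because, as text, "0.9.3" would sort after "0.17.4".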

Hugging Face Models

Model Name From First release date Download date Params (B) Context size (K) Model size on disk (GB) Input Output MoE Use cases and architecture Comments
donut-base NAVER Labs AI 2021/11 2026/01 0.250 -- 0.8 Text, Image, PDF Text No Document understanding (transformer enc-dec) OCR-free
layoutlmv2-base-uncased Microsoft Research Asia 2020/12 2026/01 0.200 -- 0.8 Text, Image Text No Document understanding (transformer enc-only) With OCR
layoutlmv3-base Microsoft Research Asia 2022/04 2026/01 0.100 -- 1.9 Text, Image, PDF Text No Document understanding (transformer enc-only) With OCR
roberta-base-squad2 Deepset 2023/06 2026/01 0.100 -- 2.4 Text Text No Extractive QA (transformer enc-only) --
distilbert-base-cased Hugging Face 2019/09 2026/01 0.065 -- 1.1 Text Text No Extractive QA (transformer enc-only) --
bart-large-cnn Facebook AI 2019/10 2026/01 0.400 -- 8.0 Text Text No Text summary (transformer enc-dec) --
pegasus-cnn_dailymail Google Research 2018/12 2026/01 7.000 -- 5.0 Text Text No Text summary (transformer enc-dec) --
t5-base Google Research 2018/05 2026/01 0.200 -- 4.2 Text Text No Text summary (transformer enc-dec) --
PP-OCRv5_server_det PaddleOCR Team, Baidu 2025/09 2026/01 0.100 -- 0.1 Image, PDF Text, Bounding boxes No Multimod CNN + transformer (txt detec + recog) OCR to raw text
idefics2-8b Hugging Face 2024/04 2026/01 8.000 -- 32.0 Text, Image Text No Vision + language with summary and analysis --
Segment-Anything-Model-2 Meta 2024/07 2026/01 0.033 -- 0.1 Image, Video Image, Video No Vision-only and segmentation --
gpt-oss-20b OpenAI 2025/08 2026/02 20.900 128 14.0 Text Text Yes Conversational LLM Corrupted
Qwen2.5-VL-72B-Instruct Alibaba Cloud 2024/09 2026/01 73.000 125 137.0 Text, Image Text No Multimodal LLM --
Qwen2.5-VL-72B-Instruct-FP8-dynamic Alibaba Cloud 2024/09 2026/01 73.000 125 72.0 Text, Image Text No Multimodal LLM --
Llama-3.2-90B-Vision-Instruct-FP8-dynamic Meta 2024/09 2026/01 89.000 128 86.0 Text, Image Text No Multimodal LLM --
FLUX.1-dev Black Forest Labs 2024/08 2026/02 12.000 -- 54.0 Text Image No Image gen (transformer + diffusion => FLUX) --
FLUX.2-klein-9B Black Forest Labs 2025/11 2026/02 9.000 40 50.0 Text, Image Image No Image gen (transformer + diffusion => FLUX) Should work on RTX 4090 (~29 GB VRAM)
FLUX.2-dev Black Forest Labs 2025/11 2026/02 32.000 -- 166.0 Text, Image Image No Image gen (transformer + diffusion => FLUX) --
FLUX.2-dev-bnb-4bit Black Forest Labs 2025/11 2026/02 32.000 -- 32.0 Text, Image Image No Image gen (transformer + diffusion => FLUX) Should work on RTX 4090 (~18 GB VRAM)

Technical Details about the ai-models Group

Files and folders created under /mnt/data/ai-models are assigned to the ai-models group by default. To achieve this, the setgid bit has been set on /mnt/data/ai-models and all of its sub-folders:

find /mnt/data/ai-models -type d -exec sudo chmod g+s {} +
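To confirm that the setgid bit is present everywhere, one possible check is to list the directories that lack it. The function name `dirs_missing_setgid` is hypothetical; the `-perm -2000` test matches directories whose mode includes the setgid bit, and the `!` negates it.

```shell
# List directories under a tree that are missing the setgid bit.
# An empty result means every directory passes its group on to new entries.
dirs_missing_setgid() {
    find "${1:-/mnt/data/ai-models}" -type d ! -perm -2000
}

# Usage:
#   dirs_missing_setgid                      # checks /mnt/data/ai-models
#   dirs_missing_setgid /some/other/tree     # checks another tree
```

Any path printed by this check can be fixed by re-running the chmod command above.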

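Next, the ACLs on /mnt/data/ai-models have been updated so that the ai-models group gets rwx on newly created folders and rw on newly created files. The default ACL grants rwx in both cases; new files end up with rw because programs normally create them without the execute bit: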

# install the acl package, which provides the `setfacl` command
sudo apt install acl
# apply the ACL to existing folders and files
find /mnt/data/ai-models -type d -exec sudo setfacl -m g:ai-models:rwx {} +
find /mnt/data/ai-models -type f -exec sudo setfacl -m g:ai-models:rw- {} +
# set the default ACL inherited by future folders and files
sudo setfacl -R -d -m g:ai-models:rwx /mnt/data/ai-models
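One way to observe the combined effect of the setgid bit and the default ACL is to create a scratch folder and file and look at what they inherit. The sketch below assumes write access to the target folder and GNU `stat`; the function name `check_new_entry_perms` and the `.perm-probe-*` scratch name are hypothetical.

```shell
# Hypothetical check: create a scratch folder and file, print their mode,
# group and path, then clean up. Needs write access to the target folder.
# `stat -c` is the GNU coreutils form.
check_new_entry_perms() {
    base="${1:-/mnt/data/ai-models}"
    probe="$base/.perm-probe-$$"
    mkdir "$probe" || return 1
    touch "$probe/file"
    stat -c '%A %G %n' "$probe" "$probe/file"
    rm -r "$probe"
}

# Usage:
#   check_new_entry_perms                    # probes /mnt/data/ai-models
```

On a correctly configured tree, both lines would be expected to show the ai-models group. Running `getfacl` on a path shows its full ACL, including the default entries.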