AI Models
Preinstalled AI Models
Preinstalled AI models are located in the /mnt/data/ai-models folder.
Everyone can read this folder but only users in the ai-models group can modify
it.
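As a quick check of the access rule above, the following sketch reports whether the current user is expected to have write access. The `in_group` helper and the `MODELS_DIR` and `GROUP` variables are hypothetical names introduced for illustration; only the folder path and group name come from this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the current user may modify the model folder.
# MODELS_DIR and GROUP mirror the path and group name given on this page.
MODELS_DIR="${MODELS_DIR:-/mnt/data/ai-models}"
GROUP="${GROUP:-ai-models}"

# Succeeds if the space-separated group list in $2 contains the group name $1.
in_group() {
    local wanted="$1" groups="$2" g
    for g in $groups; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

if in_group "$GROUP" "$(id -nG)"; then
    echo "You are in '$GROUP': read and write access expected."
else
    echo "You are not in '$GROUP': read-only access expected."
fi

# Show owner, group and mode of the folder, if it exists on this machine.
if [ -d "$MODELS_DIR" ]; then
    ls -ld "$MODELS_DIR"
fi
```

A group membership that was granted recently usually becomes visible only after logging in again.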
For now the /mnt/data/ai-models folder contains three sub-folders:
- gguf: Models in the GGUF format (typically used by llama.cpp).
- ollama: Models downloaded from Ollama (generally derived from the GGUF format, but they only work with Ollama).
- huggingface-snapshots: Models downloaded from Hugging Face; the format can differ depending on the repository.
The following sub-sections list the models available in each of the three sub-folders.
GGUF Models
| Model Name | Alias of | From | First rel. date | DL date | Params (B) | Context size (K) | Model size on disk (GB) | Input | Output | MoE | Use cases and architecture | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| deepseek-r1:1.5b | DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf | DeepSeek | 2025/01 | 2026/06 | 1.500 | 128 | 1.1 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:7b | DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf | DeepSeek | 2025/01 | 2026/06 | 7.000 | 128 | 4.4 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:8b | DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf | DeepSeek | 2025/01 | 2026/06 | 8.000 | 128 | 4.6 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:14b | DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf | DeepSeek | 2025/01 | 2026/06 | 14.000 | 128 | 8.4 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:32b | DeepSeek-R1-Distill-Qwen-32B-Q4_K_M.gguf | DeepSeek | 2025/01 | 2026/06 | 32.000 | 128 | 19.0 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:70b | DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf | DeepSeek | 2025/01 | 2026/06 | 70.000 | 128 | 40.0 | Text | Text | No | Conversational LLM | -- |
| devstral-small-2:24b | Devstral-Small-2-24B-Instruct-2512-Q4_K_M.gguf | Mistral AI | 2025/12 | 2026/06 | 24.000 | 256 | 14.0 | Text, Image | Text | No | LLM for coding | -- |
| gemma4:e2b | Gemma-4-E2B-it-Q4_K_M.gguf | Google DeepMind | 2026/04 | 2026/04 | 5.000 | 128 | 7.2 | Text, Image, Video | Text | No | Multimodal LLM | -- |
| gemma4:e4b | Gemma-4-E4B-it-Q4_K_M.gguf | Google DeepMind | 2026/04 | 2026/04 | 8.000 | 128 | 9.6 | Text, Image, Video | Text | No | Multimodal LLM | -- |
| gemma4:12b | Gemma-4-12B-it-Q4_K_M.gguf | Google DeepMind | 2026/06 | 2026/06 | 12.000 | 256 | 18.0 | Text, Image, Video | Text | No | Multimodal LLM | -- |
| gemma4:26b | Gemma-4-26B-A4B-it-Q4_K_M.gguf | Google DeepMind | 2026/04 | 2026/04 | 26.000 | 256 | 18.0 | Text, Image, Video | Text | Yes | Multimodal LLM | -- |
| gemma4:31b | Gemma-4-31B-it-Q4_K_M.gguf | Google DeepMind | 2026/04 | 2026/04 | 31.000 | 256 | 20.0 | Text, Image, Video | Text | No | Multimodal LLM | -- |
| gpt-oss:20b | GPT-OSS-20B-A4B-Q4_K_M.gguf | OpenAI | 2025/08 | 2026/06 | 20.000 | 128 | 11.0 | Text | Text | Yes | Conversational LLM | -- |
| gpt-oss-mxfp4:20b | GPT-OSS-20B-A4B-MXFP4.gguf | OpenAI | 2025/08 | 2026/06 | 20.000 | 128 | 12.0 | Text | Text | Yes | Conversational LLM | -- |
| llama2-q4_0:7b | Llama-2-7B-Q4_0.gguf | Meta | 2023/07 | 2025/11 | 7.000 | 4 | 3.6 | Text | Text | No | Conversational LLM | -- |
| llama2:7b | Llama-2-7B-Q4_K_M.gguf | Meta | 2023/07 | 2026/06 | 7.000 | 4 | 3.9 | Text | Text | No | Conversational LLM | -- |
| llama2:13b | Llama-2-13B-Q4_K_M.gguf | Meta | 2023/07 | 2026/06 | 13.000 | 4 | 7.4 | Text | Text | No | Conversational LLM | -- |
| llama3.1:8b | Llama-3.1-8B-Instruct-Q4_K_M.gguf | Meta | 2024/07 | 2026/06 | 8.000 | 128 | 4.6 | Text | Text | No | Conversational LLM | -- |
| llama3.1:70b | Llama-3.1-70B-Instruct-Q4_K_M.gguf | Meta | 2024/07 | 2026/06 | 70.000 | 128 | 40.0 | Text | Text | No | Conversational LLM | -- |
| llama3.2:1b | Llama-3.2-1B-Instruct-Q4_K_M.gguf | Meta | 2024/09 | 2026/06 | 1.000 | 128 | 0.8 | Text | Text | No | Conversational LLM | -- |
| llama3.2:3b | Llama-3.2-3B-Instruct-Q4_K_M.gguf | Meta | 2024/09 | 2026/06 | 3.000 | 128 | 1.9 | Text | Text | No | Conversational LLM | -- |
| ministral3:3b | Ministral-3-3B-Instruct-2512-Q4_K_M.gguf | Mistral AI | 2025/12 | 2026/06 | 3.000 | 256 | 2.0 | Text, Image | Text | No | Multimodal LLM | -- |
| ministral3:8b | Ministral-3-8B-Instruct-2512-Q4_K_M.gguf | Mistral AI | 2025/12 | 2026/06 | 8.000 | 256 | 4.9 | Text, Image | Text | No | Multimodal LLM | -- |
| ministral3:14b | Ministral-3-14B-Instruct-2512-Q4_K_M.gguf | Mistral AI | 2025/12 | 2026/06 | 14.000 | 256 | 7.7 | Text, Image | Text | No | Multimodal LLM | -- |
| mistral-small3.2:24b | Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M.gguf | Mistral AI | 2025/06 | 2026/06 | 24.000 | 128 | 14.0 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3.5:0.8b | Qwen3.5-0.8B-Q4_K_M.gguf | Alibaba Cloud | 2026/02 | 2026/06 | 0.800 | 256 | 0.5 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3.5:2b | Qwen3.5-2B-Q4_K_M.gguf | Alibaba Cloud | 2026/02 | 2026/06 | 2.000 | 256 | 1.2 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3.5:4b | Qwen3.5-4B-Q4_K_M.gguf | Alibaba Cloud | 2026/02 | 2026/06 | 4.000 | 256 | 2.6 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3.5:9b | Qwen3.5-9B-Q4_K_M.gguf | Alibaba Cloud | 2026/02 | 2026/06 | 9.000 | 256 | 5.3 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3.5:27b | Qwen3.5-27B-Q4_K_M.gguf | Alibaba Cloud | 2026/02 | 2026/06 | 27.000 | 256 | 16.0 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3.5:35b | Qwen3.5-35B-A3B-Q4_K_M.gguf | Alibaba Cloud | 2026/02 | 2026/06 | 35.000 | 256 | 21.0 | Text, Image | Text | Yes | Multimodal LLM | -- |
| qwen3.6:27b | Qwen3.6-27B-Q4_K_M.gguf | Alibaba Cloud | 2026/04 | 2026/04 | 27.000 | 256 | 16.0 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3.6:35b | Qwen3.6-35B-A3B-UD-Q4-K-M.gguf | Alibaba Cloud | 2026/04 | 2026/04 | 35.000 | 256 | 21.0 | Text, Image | Text | Yes | Multimodal LLM | -- |
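The table above can be used to build the path of a model file for llama.cpp. The sketch below is only an illustration: it assumes that the files sit directly in the gguf sub-folder under the names listed in the "Alias of" column and that the `llama-cli` binary from llama.cpp is installed. The `gguf_path` helper and the `GGUF_DIR` variable are hypothetical names.

```shell
#!/usr/bin/env bash
# Assumption: GGUF files are stored directly in the gguf sub-folder,
# under the file names listed in the "Alias of" column.
GGUF_DIR="${GGUF_DIR:-/mnt/data/ai-models/gguf}"

# Print the full path of a GGUF file given its file name from the table.
gguf_path() {
    printf '%s/%s\n' "$GGUF_DIR" "$1"
}

model="$(gguf_path "Llama-3.2-1B-Instruct-Q4_K_M.gguf")"

# Run a short completion only when both llama.cpp and the file are available.
if command -v llama-cli >/dev/null 2>&1 && [ -r "$model" ]; then
    # -m: model file, -p: prompt, -n: maximum number of tokens to generate
    llama-cli -m "$model" -p "Explain what a GGUF file is." -n 64
else
    echo "Skipping: llama-cli or $model not found."
fi
```

Depending on the llama.cpp version, `llama-cli` may open an interactive chat session for instruction-tuned models instead of exiting after the completion.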
Ollama Models
| Model Name | Alias of | From | First rel. date | DL date | Params (B) | Context size (K) | Model size on disk (GB) | Input | Output | MoE | Use cases and architecture | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| codellama:13b | codellama:13b-instruct-q4_0 | Meta | 2023/08 | 2026/02 | 13.000 | 16 | 7.4 | Text | Text | No | LLM for coding | -- |
| codellama:34b | codellama:34b-instruct-q4_0 | Meta | 2023/08 | 2026/02 | 34.000 | 16 | 19.0 | Text | Text | No | LLM for coding | -- |
| deepseek-coder-v2:16b | deepseek-coder-v2:16b-lite-instruct-q4_0 | DeepSeek | 2024/07 | 2026/02 | 16.000 | 160 | 8.9 | Text | Text | Yes | LLM for coding | -- |
| deepseek-r1:1.5b | deepseek-r1:1.5b-qwen-distill-q4_K_M | DeepSeek | 2025/01 | 2026/02 | 1.500 | 128 | 1.1 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:7b | deepseek-r1:7b-qwen-distill-q4_K_M | DeepSeek | 2025/01 | 2026/02 | 7.000 | 128 | 4.7 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:8b | deepseek-r1:8b-0528-qwen3-q4_K_M | DeepSeek | 2025/01 | 2026/02 | 8.000 | 128 | 5.2 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:14b | deepseek-r1:14b-qwen-distill-q4_K_M | DeepSeek | 2025/01 | 2025/10 | 14.800 | 128 | 9.0 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:32b | deepseek-r1:32b-qwen-distill-q4_K_M | DeepSeek | 2025/01 | 2026/02 | 32.000 | 128 | 20.0 | Text | Text | No | Conversational LLM | -- |
| deepseek-r1:70b | deepseek-r1:70b-llama-distill-q4_K_M | DeepSeek | 2025/01 | 2026/02 | 70.000 | 128 | 43.0 | Text | Text | No | Conversational LLM | -- |
| devstral-small-2:24b | devstral-small-2:24b-instruct-2512-q4_K_M | Mistral AI | 2025/12 | 2026/01 | 24.000 | 256 | 15.0 | Text, Image | Text | No | LLM for coding | Incompatible with Ollama v0.9.3+IPEX-LLM |
| aiasistentworld/ERNIE-4.5-21B-A3B-Thinking-LLM:latest | Q4_K_M | Baidu | 2025/06 | 2026/02 | 21.800 | 128 | 13.0 | Text | Text | Yes | Conversational LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| gemma3:270m | gemma3:270m-it-q8_0 | Google DeepMind | 2025/03 | 2026/02 | 0.270 | 32 | 0.3 | Text | Text | No | Conversational LLM | Requires Ollama 0.6 or later |
| gemma3:1b | gemma3:1b-it-q4_K_M | Google DeepMind | 2025/03 | 2026/02 | 1.000 | 32 | 0.8 | Text | Text | No | Conversational LLM | Requires Ollama 0.6 or later |
| gemma3:4b | gemma3:4b-it-q4_K_M | Google DeepMind | 2025/03 | 2026/02 | 4.000 | 128 | 3.3 | Text, Image | Text | No | Conversational LLM | Requires Ollama 0.6 or later |
| gemma3:12b | gemma3:12b-it-q4_K_M | Google DeepMind | 2025/03 | 2026/02 | 12.000 | 128 | 8.1 | Text, Image | Text | No | Conversational LLM | Requires Ollama 0.6 or later |
| gemma3:27b | gemma3:27b-it-q4_K_M | Google DeepMind | 2025/03 | 2026/02 | 27.000 | 128 | 17.0 | Text, Image | Text | No | Conversational LLM | Requires Ollama 0.6 or later |
| gemma4:e2b | gemma4:e2b-it-q4_K_M | Google DeepMind | 2026/04 | 2026/04 | 5.000 | 128 | 7.2 | Text, Image, Video | Text | Yes | Multimodal LLM | Requires Ollama 0.20.0 or later |
| gemma4:e4b | gemma4:e4b-it-q4_K_M | Google DeepMind | 2026/04 | 2026/04 | 8.000 | 128 | 9.6 | Text, Image, Video | Text | Yes | Multimodal LLM | Requires Ollama 0.20.0 or later |
| gemma4:26b | gemma4:26b-a4b-it-q4_K_M | Google DeepMind | 2026/04 | 2026/04 | 26.000 | 256 | 18.0 | Text, Image, Video | Text | Yes | Multimodal LLM | Requires Ollama 0.20.0 or later |
| gemma4:31b | gemma4:31b-it-q4_K_M | Google DeepMind | 2026/04 | 2026/04 | 31.000 | 256 | 20.0 | Text, Image, Video | Text | No | Multimodal LLM | Requires Ollama 0.20.0 or later |
| glm4:9b | glm4:9b-chat-q4_0 | Zhipu AI | 2024/06 | 2025/10 | 9.000 | 128 | 5.5 | Text | Text | No | Conversational LLM | Requires Ollama 0.2 or later |
| glm-4.7-flash:q4_K_M | -- | Zhipu AI | 2026/01 | 2025/10 | 30.000 | 198 | 19.0 | Text | Text | Yes | Conversational LLM | Requires Ollama 0.14.3 or later |
| glm-4.7-flash:q8_0 | -- | Zhipu AI | 2026/01 | 2025/10 | 30.000 | 198 | 32.0 | Text | Text | Yes | Conversational LLM | Requires Ollama 0.14.3 or later |
| glm-4.7-flash:bf16 | -- | Zhipu AI | 2026/01 | 2025/10 | 30.000 | 198 | 60.0 | Text | Text | Yes | Conversational LLM | Requires Ollama 0.14.3 or later |
| gpt-oss:20b | -- | OpenAI | 2025/08 | 2025/10 | 20.900 | 128 | 14.0 | Text | Text | Yes | Conversational LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| gpt-oss:120b | -- | OpenAI | 2025/08 | 2025/10 | 120.000 | 128 | 65.0 | Text | Text | Yes | Conversational LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| granite4:350m | granite4:350m-bf16 | IBM | 2025/11 | 2026/02 | 0.350 | 32 | 0.7 | Text | Text | No | Conversational LLM | -- |
| granite4:350m-h | granite4:350m-h-q8_0 | IBM | 2025/11 | 2026/02 | 0.350 | 32 | 0.4 | Text | Text | Yes | Conversational LLM | -- |
| granite4:1b | granite4:1b-bf16 | IBM | 2025/11 | 2026/02 | 1.000 | 128 | 3.3 | Text | Text | No | Conversational LLM | -- |
| granite4:1b-h | granite4:1b-h-q8_0 | IBM | 2025/11 | 2026/02 | 1.000 | 1000 | 1.6 | Text | Text | Yes | Conversational LLM | -- |
| granite4:3b | granite4:micro (Q4_K_M) | IBM | 2025/11 | 2026/02 | 3.000 | 128 | 2.1 | Text | Text | No | Conversational LLM | -- |
| granite4:3b-h | granite4:micro-h (Q4_K_M) | IBM | 2025/11 | 2026/02 | 3.000 | 1000 | 1.9 | Text | Text | Yes | Conversational LLM | -- |
| granite4:7b-a1b-h | granite4:tiny-h (Q4_K_M) | IBM | 2025/11 | 2026/02 | 7.000 | 1000 | 4.2 | Text | Text | Yes | Conversational LLM | -- |
| granite4:32b-a9b-h | granite4:small-h (Q4_K_M) | IBM | 2025/11 | 2026/02 | 32.000 | 1000 | 19.0 | Text | Text | Yes | Conversational LLM | -- |
| internlm2.5:1.8b-chat | -- | Shanghai AI Laboratory | 2024/07 | 2025/02 | 1.800 | 32 | 3.8 | Text | Text | No | Conversational LLM | -- |
| internlm2.5:7b-chat | -- | Shanghai AI Laboratory | 2024/07 | 2025/02 | 7.000 | 32 | 15.0 | Text | Text | No | Conversational LLM | -- |
| internlm2.5:7b-chat-1m | -- | Shanghai AI Laboratory | 2024/07 | 2025/02 | 7.000 | 256 | 15.0 | Text | Text | No | Conversational LLM | -- |
| internlm2.5:20b-chat | -- | Shanghai AI Laboratory | 2024/07 | 2025/02 | 20.000 | 32 | 40.0 | Text | Text | No | Conversational LLM | -- |
| internlm3-8b-instruct | -- | Shanghai AI Laboratory | 2025/01 | 2025/02 | 8.000 | 32 | 18.0 | Text | Text | No | Conversational LLM | -- |
| llama2:7b | llama2:7b-chat-q4_0 | Meta | 2023/07 | 2025/02 | 7.000 | 4 | 3.8 | Text | Text | No | Conversational LLM | -- |
| llama2:13b | llama2:13b-chat-q4_0 | Meta | 2023/07 | 2025/02 | 13.000 | 4 | 7.4 | Text | Text | No | Conversational LLM | -- |
| llama2:70b | llama2:70b-chat-q4_0 | Meta | 2023/07 | 2025/02 | 70.000 | 4 | 39.0 | Text | Text | No | Conversational LLM | -- |
| llama3.1:8b | llama3.1:8b-instruct-q4_K_M | Meta | 2024/07 | 2025/02 | 8.000 | 128 | 4.9 | Text | Text | No | Conversational LLM | -- |
| llama3.1:70b | llama3.1:70b-instruct-q4_K_M | Meta | 2024/07 | 2025/02 | 70.000 | 128 | 43.0 | Text | Text | No | Conversational LLM | -- |
| llama3.2:1b | llama3.2:1b-instruct-q8_0 | Meta | 2024/09 | 2025/02 | 1.000 | 128 | 1.3 | Text | Text | No | Conversational LLM | -- |
| llama3.2:3b | llama3.2:3b-instruct-q4_K_M | Meta | 2024/09 | 2025/02 | 3.000 | 128 | 2.0 | Text | Text | No | Conversational LLM | -- |
| llava:13b | llava:13b-v1.6-vicuna-q4_0 | Microsoft Research | 2023/10 | 2026/02 | 13.000 | 4 | 8.0 | Text, Image | Text | No | Multimodal LLM | -- |
| llava:34b | llava:34b-v1.6-q4_0 | Microsoft Research | 2023/10 | 2026/02 | 34.000 | 4 | 20.0 | Text, Image | Text | No | Multimodal LLM | -- |
| llava-llama3:8b | llava-llama3:8b-v1.1-q4_0 | Microsoft Research | 2024/04 | 2026/02 | 8.000 | 8 | 5.5 | Text, Image | Text | No | Multimodal LLM | -- |
| mistral:7b | mistral:7b-instruct-v0.3-q4_K_M | Mistral AI | 2023/09 | 2026/03 | 7.000 | 32 | 4.4 | Text | Text | No | Conversational LLM | -- |
| mistral-small3.2:24b | mistral-small3.2:24b-instruct-2506-q4_K_M | Mistral AI | 2025/06 | 2026/01 | 24.000 | 128 | 15.0 | Text, Image | Text | No | Multimodal LLM | -- |
| mistral-nemo | mistral-nemo:12b-instruct-2407-q4_0 | Mistral AI | 2024/07 | 2026/03 | 12.000 | 1000 | 7.1 | Text | Text | No | Conversational LLM | -- |
| mixtral:8x7b | mixtral:8x7b-instruct-v0.1-q4_0 | Mistral AI | 2023/12 | 2026/01 | 57.000 | 32 | 26.0 | Text | Text | Yes | Conversational LLM | -- |
| mixtral:8x22b | mixtral:8x22b-instruct-v0.1-q4_0 | Mistral AI | 2024/04 | 2025/10 | 140.600 | 64 | 80.0 | Text | Text | Yes | Conversational LLM | -- |
| nomic-embed-text-v2-moe | -- | Nomic AI | 2025/02 | 2026/01 | 0.305 | 512 | 1.0 | Text | Text | Yes | LLM for multilingual retrieval | -- |
| olmo-3:7b | olmo-3:7b-think-q4_K_M | Allen AI | 2025/11 | 2026/02 | 7.000 | 64 | 4.5 | Text | Text | No | Conversational LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| olmo-3:32b | olmo-3:32b-think-q4_K_M | Allen AI | 2025/11 | 2026/02 | 32.000 | 64 | 19.0 | Text | Text | No | Conversational LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| olmo-3.1:32b | olmo-3.1:32b-think-q4_K_M | Allen AI | 2025/12 | 2026/02 | 32.000 | 64 | 19.0 | Text | Text | No | Conversational LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| olmo-3.1:32b-instruct | olmo-3.1:32b-instruct-q4_K_M | Allen AI | 2025/12 | 2026/02 | 32.000 | 64 | 19.0 | Text | Text | No | Conversational LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| phi4:14b | phi4:14b-q4_K_M | Microsoft | 2025/01 | 2026/02 | 14.000 | 16 | 9.1 | Text | Text | No | Conversational LLM | -- |
| phi4-mini:3.8b | phi4-mini:3.8b-q4_K_M | Microsoft | 2025/01 | 2026/02 | 3.800 | 128 | 2.5 | Text | Text | No | Conversational LLM | -- |
| phi4-reasoning:14b | phi4-reasoning:14b-q4_K_M | Microsoft | 2025/04 | 2026/02 | 14.000 | 16 | 11.0 | Text | Text | No | Conversational LLM | -- |
| phi4-mini-reasoning:3.8b | phi4-mini-reasoning:3.8b-q4_K_M | Microsoft | 2025/01 | 2026/02 | 3.800 | 128 | 3.2 | Text | Text | No | Conversational LLM | -- |
| qwen2.5:0.5b | qwen2.5:0.5b-instruct-q4_K_M | Alibaba Cloud | 2024/09 | 2026/02 | 0.500 | 32 | 0.4 | Text | Text | No | Conversational LLM | -- |
| qwen2.5:1.5b | qwen2.5:1.5b-instruct-q4_K_M | Alibaba Cloud | 2024/09 | 2026/02 | 1.500 | 32 | 1.0 | Text | Text | No | Conversational LLM | -- |
| qwen2.5:3b | qwen2.5:3b-instruct-q4_K_M | Alibaba Cloud | 2024/09 | 2026/02 | 3.000 | 32 | 1.9 | Text | Text | No | Conversational LLM | -- |
| qwen2.5:7b | qwen2.5:7b-instruct-q4_K_M | Alibaba Cloud | 2024/09 | 2026/02 | 7.000 | 32 | 4.7 | Text | Text | No | Conversational LLM | -- |
| qwen2.5:14b | qwen2.5:14b-instruct-q4_K_M | Alibaba Cloud | 2024/09 | 2026/02 | 14.000 | 32 | 9.0 | Text | Text | No | Conversational LLM | -- |
| qwen2.5:32b | qwen2.5:32b-instruct-q4_K_M | Alibaba Cloud | 2024/09 | 2026/02 | 32.000 | 32 | 20.0 | Text | Text | No | Conversational LLM | -- |
| qwen2.5:72b | qwen2.5:72b-instruct-q4_K_M | Alibaba Cloud | 2024/09 | 2026/02 | 72.000 | 32 | 47.0 | Text | Text | No | Conversational LLM | -- |
| qwen2.5vl:7b | qwen2.5vl:7b-q4_K_M | Alibaba Cloud | 2024/12 | 2026/02 | 7.000 | 125 | 6.0 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen2.5vl:32b | qwen2.5vl:32b-q4_K_M | Alibaba Cloud | 2024/12 | 2026/02 | 32.000 | 125 | 21.0 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3:0.6b | qwen3:0.6b-q4_K_M | Alibaba Cloud | 2025/04 | 2025/10 | 0.600 | 40 | 0.5 | Text | Text | No | Conversational LLM | -- |
| qwen3:1.7b | qwen3:1.7b-q4_K_M | Alibaba Cloud | 2025/04 | 2025/10 | 1.700 | 40 | 1.4 | Text | Text | No | Conversational LLM | -- |
| qwen3:4b | qwen3:4b-q4_K_M | Alibaba Cloud | 2025/04 | 2025/10 | 4.000 | 256 | 2.5 | Text | Text | No | Conversational LLM | -- |
| qwen3:8b | qwen3:4b-thinking-2507-q4_K_M | Alibaba Cloud | 2025/04 | 2025/10 | 8.000 | 40 | 5.2 | Text | Text | No | Conversational LLM | -- |
| qwen3:14b | qwen3:14b-thinking-2507-q4_K_M | Alibaba Cloud | 2025/04 | 2025/10 | 14.000 | 40 | 9.3 | Text | Text | No | Conversational LLM | -- |
| qwen3:30b | qwen3:30b-a3b-thinking-2507-q4_K_M | Alibaba Cloud | 2025/04 | 2025/10 | 30.500 | 256 | 19.0 | Text | Text | Yes | Conversational LLM | -- |
| qwen3:32b | qwen3:32b-q4_K_M | Alibaba Cloud | 2025/04 | 2026/02 | 32.000 | 40 | 20.0 | Text | Text | No | Conversational LLM | -- |
| qwen3-coder:30b | qwen3-coder:30b-a3b-q4_K_M | Alibaba Cloud | 2025/08 | 2025/10 | 30.500 | 256 | 19.0 | Text | Text | Yes | LLM for coding | -- |
| qwen3-coder-next:latest | qwen3-coder-next:q4_K_M | Alibaba Cloud | 2026/02 | 2026/03 | 80.000 | 256 | 52.0 | Text | Text | Yes | LLM for coding | -- |
| qwen3-vl:2b | qwen3-vl:2b-thinking-q4_K_M | Alibaba Cloud | 2025/10 | 2026/02 | 2.000 | 256 | 1.9 | Text, Image | Text | No | Multimodal LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| qwen3-vl:4b | qwen3-vl:4b-thinking-q4_K_M | Alibaba Cloud | 2025/10 | 2026/02 | 4.000 | 256 | 3.3 | Text, Image | Text | No | Multimodal LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| qwen3-vl:8b | qwen3-vl:8b-thinking-q4_K_M | Alibaba Cloud | 2025/10 | 2026/02 | 8.000 | 256 | 6.1 | Text, Image | Text | No | Multimodal LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| qwen3-vl:30b | qwen3-vl:30b-a3b-thinking-q4_K_M | Alibaba Cloud | 2025/10 | 2026/02 | 30.000 | 256 | 20.0 | Text, Image | Text | Yes | Multimodal LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| qwen3-vl:32b | qwen3-vl:32b-thinking-q4_K_M | Alibaba Cloud | 2025/10 | 2026/02 | 32.000 | 256 | 21.0 | Text, Image | Text | No | Multimodal LLM | Incompatible with Ollama v0.9.3+IPEX-LLM |
| qwen3.5:0.8b | qwen3.5:0.8b-q8_0 | Alibaba Cloud | 2026/02 | 2026/03 | 0.800 | 256 | 1.0 | Text, Image | Text | No | Multimodal LLM | Requires Ollama 0.17.4 or later |
| qwen3.5:2b | qwen3.5:2b-q8_0 | Alibaba Cloud | 2026/02 | 2026/03 | 2.000 | 256 | 2.7 | Text, Image | Text | No | Multimodal LLM | Requires Ollama 0.17.4 or later |
| qwen3.5:4b | qwen3.5:4b-q4_K_M | Alibaba Cloud | 2026/02 | 2026/03 | 4.000 | 256 | 3.4 | Text, Image | Text | No | Multimodal LLM | Requires Ollama 0.17.4 or later |
| qwen3.5:9b | qwen3.5:9b-q4_K_M | Alibaba Cloud | 2026/02 | 2026/03 | 9.000 | 256 | 6.6 | Text, Image | Text | No | Multimodal LLM | Requires Ollama 0.17.4 or later |
| qwen3.5:27b | qwen3.5:27b-q4_K_M | Alibaba Cloud | 2026/02 | 2026/03 | 27.000 | 256 | 17.0 | Text, Image | Text | No | Multimodal LLM | Requires Ollama 0.17.4 or later |
| qwen3.5:35b | qwen3.5:35b-a3b-q4_K_M | Alibaba Cloud | 2026/02 | 2026/03 | 35.000 | 256 | 24.0 | Text, Image | Text | Yes | Multimodal LLM | Requires Ollama 0.17.4 or later |
| qwen3.5:122b | qwen3.5:122b-a10b-q4_K_M | Alibaba Cloud | 2026/02 | 2026/03 | 122.000 | 256 | 81.0 | Text, Image | Text | Yes | Multimodal LLM | Requires Ollama 0.17.4 or later |
| qwen3.6:27b | qwen3.6:27b-q4_K_M | Alibaba Cloud | 2026/04 | 2026/04 | 27.000 | 256 | 16.0 | Text, Image | Text | No | Multimodal LLM | -- |
| qwen3.6:35b | qwen3.6:35b-a3b-q4_K_M | Alibaba Cloud | 2026/04 | 2026/04 | 35.000 | 256 | 23.0 | Text, Image | Text | Yes | Multimodal LLM | -- |
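Several rows in the "Comments" column state a minimum Ollama version. The sketch below shows one possible way to compare the installed version against such a requirement. The `version_ge` helper is a hypothetical name, and the comparison relies on `sort -V` from GNU coreutils, which is assumed to be available.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (GNU coreutils) for numeric, component-wise ordering,
# so that 0.9.3 is correctly treated as older than 0.14.3.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="0.17.4"   # example value taken from the "Comments" column for qwen3.5

if command -v ollama >/dev/null 2>&1; then
    # Extract the first x.y.z pattern printed by `ollama --version`.
    installed="$(ollama --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
    if [ -n "$installed" ] && version_ge "$installed" "$required"; then
        echo "Ollama $installed satisfies >= $required"
    else
        echo "Ollama ${installed:-unknown} does not satisfy >= $required"
    fi
else
    echo "ollama not found in PATH"
fi
```

A plain string comparison would get this wrong for versions such as 0.9.3 and 0.14.3, which is why a version-aware sort is used here.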
Hugging Face Models
| Model Name | From | First rel. date | DL date | Params (B) | Context size (K) | Model size on disk (GB) | Input | Output | MoE | Use cases and architecture | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|
| donut-base | NAVER Labs AI | 2021/11 | 2026/01 | 0.250 | -- | 0.8 | Text, Image, PDF | Text | No | Document understanding (transformer enc-dec) | OCR-free |
| layoutlmv2-base-uncased | Microsoft Research Asia | 2020/12 | 2026/01 | 0.200 | -- | 0.8 | Text, Image | Text | No | Document understanding (transformer enc-only) | With OCR |
| layoutlmv3-base | Microsoft Research Asia | 2022/04 | 2026/01 | 0.100 | -- | 1.9 | Text, Image, PDF | Text | No | Document understanding (transformer enc-only) | With OCR |
| roberta-base-squad2 | Deepset | 2023/06 | 2026/01 | 0.100 | -- | 2.4 | Text | Text | No | Extractive QA (transformer enc-only) | -- |
| distilbert-base-cased | Hugging Face | 2019/09 | 2026/01 | 0.065 | -- | 1.1 | Text | Text | No | Extractive QA (transformer enc-only) | -- |
| bart-large-cnn | Facebook AI | 2019/10 | 2026/01 | 0.400 | -- | 8.0 | Text | Text | No | Text summary (transformer enc-dec) | -- |
| pegasus-cnn_dailymail | Google Research | 2018/12 | 2026/01 | 7.000 | -- | 5.0 | Text | Text | No | Text summary (transformer enc-dec) | -- |
| t5-base | Google Research | 2018/05 | 2026/01 | 0.200 | -- | 4.2 | Text | Text | No | Text summary (transformer enc-dec) | -- |
| PP-OCRv5_server_det | PaddleOCR Team, Baidu | 2025/09 | 2026/01 | 0.100 | -- | 0.1 | Image, PDF | Text, Bounding boxes | No | Multimodal CNN + transformer (text detection + recognition) | OCR to raw text |
| idefics2-8b | Hugging Face | 2024/04 | 2026/01 | 8.000 | -- | 32.0 | Text, Image | Text | No | Vision + language with summary and analysis | -- |
| Segment-Anything-Model-2 | Meta | 2024/07 | 2026/01 | 0.033 | -- | 0.1 | Image, Video | Image, Video | No | Vision-only and segmentation | -- |
| gpt-oss-20b | OpenAI | 2025/08 | 2026/02 | 20.900 | 128 | 14.0 | Text | Text | Yes | Conversational LLM | Corrupted |
| Qwen2.5-VL-72B-Instruct | Alibaba Cloud | 2024/09 | 2026/01 | 73.000 | 125 | 137.0 | Text, Image | Text | No | Multimodal LLM | -- |
| Qwen2.5-VL-72B-Instruct-FP8-dynamic | Alibaba Cloud | 2024/09 | 2026/01 | 73.000 | 125 | 72.0 | Text, Image | Text | No | Multimodal LLM | -- |
| Llama-3.2-90B-Vision-Instruct-FP8-dynamic | Meta | 2024/09 | 2026/01 | 89.000 | 128 | 86.0 | Text, Image | Text | No | Multimodal LLM | -- |
| FLUX.1-dev | Black Forest Labs | 2024/08 | 2026/02 | 12.000 | -- | 54.0 | Text | Image | No | Image gen (transformer + diffusion => FLUX) | -- |
| FLUX.2-klein-9B | Black Forest Labs | 2025/11 | 2026/02 | 9.000 | 40 | 50.0 | Text, Image | Image | No | Image gen (transformer + diffusion => FLUX) | Should work on RTX 4090 (~29 GB VRAM) |
| FLUX.2-dev | Black Forest Labs | 2025/11 | 2026/02 | 32.000 | -- | 166.0 | Text, Image | Image | No | Image gen (transformer + diffusion => FLUX) | -- |
| FLUX.2-dev-bnb-4bit | Black Forest Labs | 2025/11 | 2026/02 | 32.000 | -- | 32.0 | Text, Image | Image | No | Image gen (transformer + diffusion => FLUX) | Should work on RTX 4090 (~18 GB VRAM) |
Technical Details about ai-models Group
Files and folders created by users in the ai-models group get the ai-models
group by default. To achieve this, the setgid bit has been set on
/mnt/data/ai-models and its sub-folders.
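The exact command used is not shown on this page. The following is a minimal sketch of how the setgid bit is commonly applied to a folder tree, demonstrated on a scratch folder so that it can be run without privileges. The `apply_setgid` helper is a hypothetical name; on the real tree it would be run with sufficient privileges against /mnt/data/ai-models, presumably after the folder's group has been set to ai-models.

```shell
#!/usr/bin/env bash
# Set the setgid bit on a folder and all of its sub-folders, so that new
# entries inherit the folder's group instead of the creator's primary group.
apply_setgid() {
    find "$1" -type d -exec chmod g+s {} +
}

# Demonstration on a scratch folder that mimics the layout described above.
demo="$(mktemp -d)"
mkdir -p "$demo/gguf" "$demo/ollama" "$demo/huggingface-snapshots"
apply_setgid "$demo"

# The group permission column should now show an "s" (for example drwx--S---).
ls -ld "$demo" "$demo/gguf"
```

On Linux, sub-folders created later inside a setgid folder inherit the bit automatically, so the recursive step matters mainly for folders that already exist.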
Then, still in the /mnt/data/ai-models folder, the default group permissions
have been updated to force rwx on newly created folders and rw on newly
created files:
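# install the acl package, which provides the `setfacl` command
sudo apt install acl
# apply the ACL to existing folders (rwx) and existing files (rw-)
find /mnt/data/ai-models -type d -exec sudo setfacl -m g:ai-models:rwx {} +
find /mnt/data/ai-models -type f -exec sudo setfacl -m g:ai-models:rw- {} +
# set a default ACL so that future files and folders inherit it
# (new files still end up rw-, because programs usually create files without the x bit)
sudo setfacl -R -d -m g:ai-models:rwx /mnt/data/ai-models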
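To confirm that the default ACL is in place, the output of `getfacl` can be inspected. The sketch below is one possible check, assuming the `getfacl` command from the acl package is installed. The `has_default_group_acl` helper and the `MODELS_DIR` variable are hypothetical names.

```shell
#!/usr/bin/env bash
MODELS_DIR="${MODELS_DIR:-/mnt/data/ai-models}"

# Succeeds if the getfacl output read from stdin contains a default rwx
# entry for the group named in $1 (exact, whole-line match).
has_default_group_acl() {
    grep -qxF "default:group:$1:rwx"
}

if command -v getfacl >/dev/null 2>&1 && [ -d "$MODELS_DIR" ]; then
    # -p keeps the leading slash in the path that getfacl prints.
    if getfacl -p "$MODELS_DIR" 2>/dev/null | has_default_group_acl ai-models; then
        echo "Default ACL for ai-models is set on $MODELS_DIR"
    else
        echo "Default ACL for ai-models is missing on $MODELS_DIR"
    fi
else
    echo "Skipping: getfacl or $MODELS_DIR not available."
fi
```

If the ACL mask restricts the entry, `getfacl` appends an `#effective:` note to the line and the exact match fails, which in this sketch is reported as missing.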