Spaces: init0 (Running)
Browse files
- README.md +0 -2
- index.html +152 -18
- style.css +0 -28
- styles.css +36 -0
README.md
CHANGED
@@ -8,5 +8,3 @@ pinned: false
 license: mit
 short_description: Top Open-Source Small Language Models for Generative AI Appl
 ---
-
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
index.html
CHANGED
@@ -1,19 +1,153 @@
-<!
-<html>
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1">
+<title>Top Open-Source Small Language Models</title>
+<link rel="stylesheet" href="styles.css"/>
+</head>
+<body>
+
+<h1>Top Open-Source Small Language Models for Generative AI Applications</h1>
+
+<p>
+Small Language Models (SLMs) are language models that contain, at most, a few billion parameters, significantly fewer
+than Large Language Models (LLMs), which can have tens or hundreds of billions, or even trillions, of parameters. SLMs
+are well suited to resource-constrained environments, as well as on-device and real-time generative AI
+applications. Many of them can run locally on a laptop using tools like LM Studio or Ollama. These models are
+typically derived from larger models using techniques such as quantization and distillation. Some well-established
+SLMs are introduced below.
+</p>
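As an aside on the quantization technique mentioned above: the core idea can be illustrated with a minimal absmax int8 sketch in NumPy. This is illustrative only; real toolchains (e.g. the quantizers behind GGUF files) use more elaborate block-wise and mixed-precision schemes.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float weights onto int8 using a single per-tensor scale (absmax)."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage takes 4x less memory than float32, at a small accuracy cost
print(np.max(np.abs(w - w_hat)))  # bounded by scale / 2
```

The rounding error of each weight is at most half a quantization step (`scale / 2`), which is why models with well-behaved weight distributions survive 8-bit compression with little quality loss.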
+<p>
+Note: All the models mentioned here are open source. However, for details regarding experimental use, commercial
+use, redistribution, and other terms, please refer to each model's license documentation.
+</p>
+
+<h2>Phi 4 Collection by Microsoft</h2>
+<p>
+This collection features a range of small language models, including reasoning models, ONNX- and GGUF-compatible
+formats, and multimodal models. The base model in the collection has 14 billion parameters, while the smallest
+models have 3.84 billion. Strategic use of synthetic data during training has led to improved performance compared
+to its teacher model (primarily GPT-4). Currently, the collection includes three versions of reasoning-focused SLMs,
+making it one of the strongest open options for reasoning tasks.
+</p>
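The teacher-to-student transfer mentioned here (distillation) is commonly framed as minimizing the KL divergence between the teacher's and student's softened next-token distributions. A schematic NumPy sketch of that loss, not Microsoft's actual training recipe:

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over softened next-token distributions."""
    p = softmax(teacher_logits, temperature)   # teacher's "soft targets"
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean())

teacher = np.array([[2.0, 1.0, 0.1]])
student = np.array([[1.8, 1.1, 0.2]])
print(distillation_loss(student, teacher))  # small: student nearly matches teacher
```

The temperature softens both distributions so the student also learns from the teacher's ranking of unlikely tokens, not just its top prediction.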
+<p>
+🔗 License: <a href="https://choosealicense.com/licenses/mit/" target="_blank">MIT</a><br>
+🔗 <a href="https://huggingface.co/collections/microsoft/phi-4-677e9380e514feb5577a40e4" target="_blank">Collection on Hugging Face</a><br>
+🔗 <a href="https://arxiv.org/abs/2412.08905" target="_blank">Technical Report</a>
+</p>
+
+<h2>Gemma 3 Collection by Google</h2>
+<p>
+This collection features multiple versions, including Image-to-Text, Text-to-Text, and Image-and-Text-to-Text
+models, available in both quantized and GGUF formats. The models vary in size, with 1, 4.3, 12.2, and 27.4 billion
+parameters. Two specialized variants have been developed for specific applications: TxGemma, optimized for
+therapeutic development, and ShieldGemma, designed for moderating text and image content.
+</p>
+<p>
+🔗 License: <a href="https://ai.google.dev/gemma/terms" target="_blank">Gemma</a><br>
+🔗 <a href="https://huggingface.co/collections/google/gemma-3-release-67c6c6f89c4f76621268bb6d" target="_blank">Collection on Hugging Face</a><br>
+🔗 <a href="https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf" target="_blank">Technical Report</a><br>
+🔗 <a href="https://huggingface.co/collections/google/shieldgemma-67d130ef8da6af884072a789" target="_blank">ShieldGemma on Hugging Face</a><br>
+🔗 <a href="https://huggingface.co/collections/google/txgemma-release-67dd92e931c857d15e4d1e87" target="_blank">TxGemma on Hugging Face</a>
+</p>
+
+<h2>Mistral Models</h2>
+<p>
+Mistral AI is a France-based AI startup and one of the pioneers in releasing open-source language models. Its
+current lineup includes three compact models: Mistral Small 3.1, Pixtral 12B, and Mistral NeMo. All of them
+are released under the <a href="https://www.apache.org/licenses/LICENSE-2.0" target="_blank">Apache 2.0 license</a>.
+</p>
+
+<p>
+<b>Mistral Small 3.1</b> is a multimodal and multilingual SLM with 24 billion parameters and a 128k-token context window.
+There are currently two versions: Base and Instruct.<br>
+🔗 <a href="https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503" target="_blank">Base Version on Hugging Face</a><br>
+🔗 <a href="https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503" target="_blank">Instruct Version on Hugging Face</a><br>
+🔗 <a href="https://mistral.ai/news/mistral-small-3-1" target="_blank">Technical Report</a>
+</p>
+
+<p>
+<b>Pixtral 12B</b> is a natively multimodal model trained on interleaved image and text data, delivering strong
+performance on multimodal tasks and instruction following while maintaining state-of-the-art results on text-only
+benchmarks. It features a newly developed 400M-parameter vision encoder and a 12B-parameter multimodal decoder based
+on Mistral NeMo. The model supports variable image sizes, aspect ratios, and multiple images within a long context
+window of up to 128k tokens.<br>
+🔗 <a href="https://huggingface.co/mistralai/Pixtral-12B-Base-2409" target="_blank">Pixtral-12B-Base-2409 on Hugging Face</a><br>
+🔗 <a href="https://huggingface.co/mistralai/Pixtral-12B-2409" target="_blank">Pixtral-12B-2409 on Hugging Face</a><br>
+🔗 <a href="https://mistral.ai/news/pixtral-12b" target="_blank">Technical Report</a>
+</p>
+
+<p>
+<b>Mistral NeMo</b> is a 12B model developed in collaboration with NVIDIA, featuring a large 128k-token context
+window and state-of-the-art reasoning, knowledge, and coding accuracy for its size.<br>
+🔗 <a href="https://huggingface.co/mistralai/Mistral-Nemo-Instruct-FP8-2407" target="_blank">Model on Hugging Face</a><br>
+🔗 <a href="https://mistral.ai/news/mistral-nemo" target="_blank">Technical Report</a>
+</p>
+
+<h2>Llama Models by Meta</h2>
+<p>
+Meta is one of the leading contributors to open-source AI. In recent years, it has released several versions of its
+Llama models. The latest series is Llama 4, although all models in that collection are currently quite large;
+smaller models may be introduced in future sub-versions, but for now that hasn't happened. The most recent
+collection that includes smaller models is Llama 3.2. It features models with 1.24 billion and 3.21 billion
+parameters and 128k-token context windows. Additionally, there is a 10.6 billion-parameter multimodal
+version designed for Image-and-Text-to-Text tasks.
+This collection also includes small variants of Llama Guard: fine-tuned language models designed for prompt and response
+classification. They can detect unsafe prompts and responses, making them useful for implementing safety measures in
+LLM-based applications.
+</p>
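Guard models of this kind are typically used as a gate around a generator: classify the prompt, generate, then classify the response. The pattern can be sketched as follows; the keyword stub stands in for the actual classifier (Llama Guard itself would be loaded through an inference library), and all names here are hypothetical.

```python
from typing import Callable

def keyword_classifier(text: str) -> str:
    """Toy stand-in for a safety model: returns 'safe' or 'unsafe'."""
    blocked = {"build a weapon", "steal credentials"}
    return "unsafe" if any(b in text.lower() for b in blocked) else "safe"

def guarded_generate(prompt: str,
                     generate: Callable[[str], str],
                     classify: Callable[[str], str] = keyword_classifier) -> str:
    """Gate both the user prompt and the model's response through a classifier."""
    if classify(prompt) == "unsafe":
        return "[prompt blocked]"
    response = generate(prompt)
    if classify(response) == "unsafe":
        return "[response blocked]"
    return response

echo = lambda p: f"Echo: {p}"  # hypothetical generator for demonstration
print(guarded_generate("Tell me a joke", echo))             # Echo: Tell me a joke
print(guarded_generate("How to steal credentials?", echo))  # [prompt blocked]
```

Checking the response as well as the prompt matters: a benign-looking prompt can still elicit unsafe output from the generator.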
+<p>
+🔗 License: <a href="https://www.llama.com/llama3_2/license/" target="_blank">LLAMA 3.2 COMMUNITY LICENSE AGREEMENT</a><br>
+🔗 <a href="https://huggingface.co/collections/meta-llama/llama-32-66f448ffc8c32f949b04c8cf" target="_blank">Collection on Hugging Face</a><br>
+🔗 <a href="https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/" target="_blank">Technical Paper</a>
+</p>
+
+<h2>Qwen 3 Collection by Alibaba</h2>
+<p>
+The Chinese tech giant Alibaba is another major player in open-source AI. It releases its language models under the
+Qwen name. The latest version is Qwen 3, which includes both small and large models. The smaller models range in
+size, with parameter counts of 14.8 billion, 8.19 billion, 4.02 billion, 2.03 billion, and even 752 million. This
+collection also includes quantized and GGUF formats.
+</p>
+<p>
+🔗 License: <a href="https://www.apache.org/licenses/LICENSE-2.0" target="_blank">Apache 2.0</a><br>
+🔗 <a href="https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f" target="_blank">Collection on Hugging Face</a><br>
+🔗 <a href="https://github.com/QwenLM/Qwen3/blob/main/Qwen3_Technical_Report.pdf" target="_blank">Technical Report</a>
+</p>
+
+<hr style="border: none; height: 1px; background-color: #ccc;">
+
+<p>This list is not limited to these five. You can explore more open-source models at:</p>
+<ul>
+<li><a href="https://huggingface.co/databricks" target="_blank">Databricks</a></li>
+<li><a href="https://huggingface.co/Cohere" target="_blank">Cohere</a></li>
+<li><a href="https://huggingface.co/deepseek-ai" target="_blank">DeepSeek</a></li>
+<li><a href="https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966" target="_blank">SmolLM</a></li>
+<li><a href="https://huggingface.co/stabilityai" target="_blank">Stability AI</a></li>
+<li><a href="https://huggingface.co/ibm-granite" target="_blank">IBM Granite</a></li>
+</ul>
+
+<hr style="border: none; height: 1px; background-color: #ccc;">
+
+<p>* Cover photo generated with <a href="https://ideogram.ai/" target="_blank">Ideogram</a></p>
+<p>* All models are available on Hugging Face.</p>
+
+</body>
 </html>
style.css
DELETED
@@ -1,28 +0,0 @@
-body {
-  padding: 2rem;
-  font-family: -apple-system, BlinkMacSystemFont, "Arial", sans-serif;
-}
-
-h1 {
-  font-size: 16px;
-  margin-top: 0;
-}
-
-p {
-  color: rgb(107, 114, 128);
-  font-size: 15px;
-  margin-bottom: 10px;
-  margin-top: 5px;
-}
-
-.card {
-  max-width: 620px;
-  margin: 0 auto;
-  padding: 16px;
-  border: 1px solid lightgray;
-  border-radius: 16px;
-}
-
-.card p:last-child {
-  margin-bottom: 0;
-}
styles.css
ADDED
@@ -0,0 +1,36 @@
+body {
+  font-family: "Segoe UI", Tahoma, Geneva, Verdana, sans-serif;
+  background-color: #f8f9fa;
+  color: #212529;
+  line-height: 1.6;
+  padding: 2rem;
+  max-width: 900px;
+  margin: auto;
+}
+h1 {
+  color: #0d6efd;
+  border-bottom: 2px solid #dee2e6;
+  padding-bottom: 0.5rem;
+}
+h2 {
+  color: #198754;
+  margin-top: 2rem;
+}
+a {
+  color: #0d6efd;
+  text-decoration: none;
+}
+a:hover {
+  text-decoration: underline;
+}
+code {
+  background-color: #e9ecef;
+  padding: 0.2rem 0.4rem;
+  border-radius: 0.25rem;
+  font-family: monospace;
+}
+.cover-image {
+  max-width: 100%;
+  height: auto;
+  margin: 1rem 0;
+}