Djuunaa

djuna

AI & ML interests

None yet

Organizations

Dev Mode Explorers · Djuna Test Lab

djuna's activity

New activity in rednote-hilab/dots.llm1.inst about 8 hours ago
reacted to MJannik's post with 🤝 about 17 hours ago
Hi everyone, we’ve got big news! Starting today, all Langfuse product features are available as free OSS (MIT license).

You can now upgrade your self-hosted Langfuse to access features like:
- Managed LLM-as-a-Judge evaluations
- Annotation queues
- Prompt experiments
- LLM playground

We’re incredibly grateful for the support of this amazing community and can’t wait to hear your feedback on the new features!

More on this change here: https://langfuse.com/blog/2025-06-04-open-sourcing-langfuse-product
reacted to eaddario's post with 🚀 2 days ago
Layer-wise and Pruned versions of google/gemma-3-12b-it

After enhancing llama.cpp to handle user-defined quantization levels for arbitrary tensors (https://github.com/ggml-org/llama.cpp/pull/12511), I have added an option to prune whole layers (https://github.com/ggml-org/llama.cpp/pull/13037), and have published two versions of google/gemma-3-12b-it for demo and testing purposes:

* Tensor-wise: eaddario/gemma-3-12b-it-GGUF
* Pruned: eaddario/gemma-3-12b-it-pruned-GGUF

Even though the perplexity scores of the pruned version are 3 times higher, the ARC, HellaSwag, MMLU, TruthfulQA and WinoGrande scores hold up remarkably well, considering two layers were removed (26 and 29). This seems to support Xin Men et al.'s conclusions in ShortGPT: Layers in Large Language Models are More Redundant Than You Expect (arXiv:2403.03853).
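For readers who want to play with the same idea outside the GGUF workflow, here is a minimal, hypothetical Python sketch of dropping layers 26 and 29 from a Hugging Face checkpoint. It is not the llama.cpp pruning option linked above, and it assumes the decoder stack is exposed as model.model.layers (typical for Llama-style causal LMs; Gemma 3's multimodal class may nest it differently).

```python
# Conceptual sketch only: NOT the llama.cpp/GGUF pruning path used for the
# published models. It illustrates removing decoder layers 26 and 29 from a
# Hugging Face checkpoint. The attribute path `model.model.layers` is an
# assumption that holds for most Llama-style text models in `transformers`.
import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "google/gemma-3-12b-it"   # model referenced in the post
PRUNE = {26, 29}                     # layers removed in the pruned GGUF demo

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Keep every decoder layer except the pruned indices.
kept = torch.nn.ModuleList(
    layer for i, layer in enumerate(model.model.layers) if i not in PRUNE
)
model.model.layers = kept
model.config.num_hidden_layers = len(kept)

# Note: a fully correct pruned model would also renumber each remaining
# layer's layer_idx / KV-cache bookkeeping; that is omitted here for brevity.
# The shortened model can then be re-evaluated (perplexity, ARC, HellaSwag, ...)
# to measure how much quality the removed layers actually contributed.
```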

A results summary is in the model card, and the full test results are in the ./scores directory. Questions and feedback are always welcome.