Djuunaa

djuna

AI & ML interests

None yet

Organizations

Dev Mode Explorers · Djuna Test Lab

djuna's activity

New activity in rednote-hilab/dots.llm1.inst about 8 hours ago
reacted to MJannik's post with 🤝 about 17 hours ago
Hi everyone, we’ve got big news! Starting today, all Langfuse product features are available as free OSS (MIT license).

You can now upgrade your self-hosted Langfuse to access features like:
- Managed LLM-as-a-Judge evaluations
- Annotation queues
- Prompt experiments
- LLM playground

We’re incredibly grateful for the support of this amazing community and can’t wait to hear your feedback on the new features!

More on this change here: https://langfuse.com/blog/2025-06-04-open-sourcing-langfuse-product
reacted to eaddario's post with 🚀 2 days ago
Layer-wise and Pruned versions of google/gemma-3-12b-it

After enhancing llama.cpp to handle user-defined quantization levels for arbitrary tensors (https://github.com/ggml-org/llama.cpp/pull/12511), I have added an option to prune whole layers (https://github.com/ggml-org/llama.cpp/pull/13037), and have published two versions of google/gemma-3-12b-it for demo and testing purposes:

* Tensor-wise: eaddario/gemma-3-12b-it-GGUF
* Pruned: eaddario/gemma-3-12b-it-pruned-GGUF

Even though the perplexity scores of the pruned version are 3 times higher, the ARC, HellaSwag, MMLU, TruthfulQA and WinoGrande scores hold up remarkably well, considering two layers were removed (26 and 29). This seems to support Xin Men et al.'s conclusions in ShortGPT: Layers in Large Language Models are More Redundant Than You Expect (arXiv:2403.03853).
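For readers who want to play with the same idea outside the GGUF workflow, here is a minimal, hypothetical Python sketch of dropping layers 26 and 29 from a Hugging Face checkpoint. It is not the llama.cpp pruning option linked above, and it assumes the decoder stack is exposed as model.model.layers (typical for Llama-style causal LMs; Gemma 3's multimodal class may nest it differently).

```python
# Conceptual sketch only: NOT the llama.cpp/GGUF pruning path used for the
# published models. It illustrates removing decoder layers 26 and 29 from a
# Hugging Face checkpoint. The attribute path `model.model.layers` is an
# assumption that holds for most Llama-style text models in `transformers`.
import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "google/gemma-3-12b-it"   # model referenced in the post
PRUNE = {26, 29}                     # layers removed in the pruned GGUF demo

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Keep every decoder layer except the pruned indices.
kept = torch.nn.ModuleList(
    layer for i, layer in enumerate(model.model.layers) if i not in PRUNE
)
model.model.layers = kept
model.config.num_hidden_layers = len(kept)

# Note: a fully correct pruned model would also renumber each remaining
# layer's layer_idx / KV-cache bookkeeping; that is omitted here for brevity.
# The shortened model can then be re-evaluated (perplexity, ARC, HellaSwag, ...)
# to measure how much quality the removed layers actually contributed.
```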

A results summary is in the model card, and the full test results are in the ./scores directory. Questions and feedback are always welcome.