treasure4l/Llama3.2-Instruct-DPO


treasure4l committed on Jan 14

Commit 64b174c · verified · 1 Parent(s): fb37779

Update README.md

Files changed (1)

README.md +6 -14


@@ -1,12 +1,14 @@
 ---
 base_model: unsloth/llama-3.2-3b-instruct-bnb-4bit
-library_name: peft
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
@@ -17,21 +19,11 @@ library_name: peft
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
 - **Model type:** [More Information Needed]
 - **Language(s) (NLP):** [More Information Needed]
 - **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
 ## Uses

 ---
 base_model: unsloth/llama-3.2-3b-instruct-bnb-4bit
+datasets:
+- trl-lib/ultrafeedback_binarized
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
+The Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks.
+This is a version of the model that has undergone Direct Preference Optimization (DPO) training using the ultrafeedback dataset.
 ## Model Details
+- **Developed by:** Treasure Mayowa
 - **Model type:** [More Information Needed]
 - **Language(s) (NLP):** [More Information Needed]
 - **License:** [More Information Needed]
+- **Finetuned from model [optional]:** Llama 3.2 Instruct
 ## Uses
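
Note on the DPO training mentioned in the updated summary: the commit does not include the training code, so the following is only a minimal sketch of the standard DPO objective for a single preference pair (one "chosen" and one "rejected" response, as in a binarized preference dataset). The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions, not values taken from this repository.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (hypothetical helper, not from the repo).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more (in log space) the policy favors each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin; beta controls how far the policy may drift from the reference.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Policy identical to the reference: margin is 0, so the loss equals log(2).
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy prefers the chosen response more than the reference does: lower loss.
    print(dpo_loss(-5.0, -15.0, -10.0, -10.0))
```

The loss falls below log(2) when the policy raises the chosen response relative to the rejected one (compared with the reference model), and rises above log(2) in the opposite case. In practice a training library would compute this over batches of token-level log-probabilities rather than scalar inputs.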