Text Generation · PEFT · Safetensors · mistral · conversational
Commit c87cf68 (0 parents), committed by merve (HF Staff) and dfurman

Duplicate from dfurman/Mistral-7B-Instruct-v0.2

Co-authored-by: Daniel Furman <dfurman@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,263 @@
+ ---
+ license: apache-2.0
+ library_name: peft
+ tags:
+ - mistral
+ datasets:
+ - jondurbin/airoboros-2.2.1
+ - Open-Orca/SlimOrca
+ - garage-bAInd/Open-Platypus
+ inference: false
+ pipeline_tag: text-generation
+ base_model: mistralai/Mistral-7B-v0.1
+ ---
+
+ <div align="center">
+
+ <img src="./logo.png" width="110px">
+
+ </div>
+
+ # Mistral-7B-Instruct-v0.2
+
+ A generative language model with 7 billion parameters, finetuned from Mistral-7B-v0.1 for instruction following.
+
+ ## Model Details
+
+ This model was built via parameter-efficient finetuning of the [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) base model on the first 20k rows of each of the [jondurbin/airoboros-2.2.1](https://huggingface.co/datasets/jondurbin/airoboros-2.2.1), [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca), and [garage-bAInd/Open-Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) datasets. A sketch of this row selection follows the list below.
+
+ - **Developed by:** Daniel Furman
+ - **Model type:** Causal language model (clm)
+ - **Language(s) (NLP):** English
+ - **License:** Apache 2.0
+ - **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+
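+ As an illustration, the "first 20k rows" selection can be expressed as a `datasets` split slice. The snippet below is a sketch, not the exact training code (see the notebook under Model Sources for the real pipeline):
+
+ ```python
+ from datasets import load_dataset
+
+ # Sketch: take the first 20k rows of one of the instruction datasets.
+ slim_orca = load_dataset("Open-Orca/SlimOrca", split="train[:20000]")
+ print(slim_orca)
+ ```
+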
+ ## Model Sources
+
+ - **Repository:** [here](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/mistral/sft_Mistral_7B_Instruct_v0_1_peft.ipynb)
+
+ ## Evaluation Results
+
+ | Metric              | Value  |
+ |---------------------|--------|
+ | MMLU (5-shot)       | Coming |
+ | ARC (25-shot)       | Coming |
+ | HellaSwag (10-shot) | Coming |
+ | TruthfulQA (0-shot) | Coming |
+ | Avg.                | Coming |
+
+ We use EleutherAI's [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above, using the same version as Hugging Face's [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
+
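+ For reference, a harness run for one of the benchmarks above might look like the sketch below; the CLI flags and the `peft=` model argument vary across harness releases, so treat this as illustrative rather than the exact leaderboard invocation:
+
+ ```python
+ # Sketch: 25-shot ARC with the lm-evaluation-harness CLI (version-dependent flags).
+ !lm_eval --model hf --model_args pretrained=mistralai/Mistral-7B-v0.1,peft=dfurman/Mistral-7B-Instruct-v0.2 --tasks arc_challenge --num_fewshot 25 --batch_size auto
+ ```
+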
+ ## Basic Usage
+
+ <details>
+
+ <summary>Setup</summary>
+
+ ```python
+ !pip install -q -U transformers peft torch accelerate einops sentencepiece
+ ```
+
+ ```python
+ import torch
+ from peft import PeftModel, PeftConfig
+ from transformers import (
+     AutoModelForCausalLM,
+     AutoTokenizer,
+ )
+ ```
+
+ ```python
+ peft_model_id = "dfurman/Mistral-7B-Instruct-v0.2"
+ config = PeftConfig.from_pretrained(peft_model_id)
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     peft_model_id,
+     use_fast=True,
+     trust_remote_code=True,
+ )
+
+ # Load the base model in fp16, then attach the LoRA adapter on top.
+ model = AutoModelForCausalLM.from_pretrained(
+     config.base_model_name_or_path,
+     torch_dtype=torch.float16,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+
+ model = PeftModel.from_pretrained(
+     model,
+     peft_model_id,
+ )
+ ```
+
+ </details>
+
+ ```python
+ messages = [
+     {"role": "user", "content": "Tell me a recipe for a mai tai."},
+ ]
+
+ print("\n\n*** Prompt:")
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     tokenize=True,
+     return_tensors="pt",
+ )
+ print(tokenizer.decode(input_ids[0]))
+
+ print("\n\n*** Generate:")
+ with torch.autocast("cuda", dtype=torch.bfloat16):
+     output = model.generate(
+         input_ids=input_ids.cuda(),
+         max_new_tokens=1024,
+         do_sample=True,
+         temperature=0.7,
+         return_dict_in_generate=True,
+         eos_token_id=tokenizer.eos_token_id,
+         pad_token_id=tokenizer.pad_token_id,
+         repetition_penalty=1.2,
+         no_repeat_ngram_size=5,
+     )
+
+ # Strip the prompt tokens and decode only the newly generated text.
+ response = tokenizer.decode(
+     output["sequences"][0][len(input_ids[0]):],
+     skip_special_tokens=True,
+ )
+ print(response)
+ ```
+
+ <details>
+
+ <summary>Outputs</summary>
+
+ **Prompt**:
+
+ ```python
+ "<s> [INST] Tell me a recipe for a mai tai. [/INST]"
+ ```
+
+ **Generation**:
+
+ ```python
+ """1. Combine the following ingredients in a cocktail shaker:
+ 2 oz light rum (or white rum)
+ 1 oz dark rum
+ 0.5 oz orange curacao or triple sec
+ 0.75 oz lime juice, freshly squeezed
+ 0.5 tbsp simple syrup (optional; if you like your drinks sweet)
+ Few drops of bitters (Angostura is traditional but any will do)
+ Ice cubes to fill the shaker
+
+ 2. Shake vigorously until well-chilled and combined.
+ 3. Strain into an ice-filled glass.
+ 4. Garnish with a slice of lime or an orange wedge, if desired."""
+ ```
+
+ </details>
+
+ ## Speeds, Sizes, Times
+
+ | runtime / 50 tokens (sec) | GPU                 | dtype          | VRAM (GB) |
+ |:-------------------------:|:-------------------:|:--------------:|:---------:|
+ | 3.21                      | 1x A100 (40 GB SXM) | torch.bfloat16 | 16        |
+
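+ The figure above can be approximated with a simple wall-clock measurement. The sketch below assumes the `model` and `input_ids` objects from the Basic Usage section:
+
+ ```python
+ import time
+
+ # Sketch: time a fixed 50-token generation (min_new_tokens keeps the
+ # model from stopping early at an EOS token).
+ start = time.time()
+ _ = model.generate(
+     input_ids=input_ids.cuda(),
+     max_new_tokens=50,
+     min_new_tokens=50,
+     do_sample=False,
+ )
+ print(f"runtime / 50 tokens: {time.time() - start:.2f} sec")
+ ```
+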
+ ## Training
+
+ It took ~5 hours to train 3 epochs on 1x A100 (40 GB SXM).
+
+ ### Prompt Format
+
+ This model was finetuned with the following format:
+
+ ```python
+ tokenizer.chat_template = "{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST] ' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token + ' ' }}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}"
+ ```
+
+ This format is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating) via the `apply_chat_template()` method. Here's an illustrative example:
+
+ ```python
+ messages = [
+     {"role": "user", "content": "Tell me a recipe for a mai tai."},
+     {"role": "assistant", "content": "1 oz light rum\n½ oz dark rum\n¼ oz orange curaçao\n2 oz pineapple juice\n¾ oz lime juice\nDash of orgeat syrup (optional)\nSplash of grenadine (for garnish, optional)\nLime wheel and cherry garnishes (optional)\n\nShake all ingredients except the splash of grenadine in a cocktail shaker over ice. Strain into an old-fashioned glass filled with fresh ice cubes. Gently pour the splash of grenadine down the side of the glass so that it sinks to the bottom. Add garnishes as desired."},
+     {"role": "user", "content": "How can I make it more upscale and luxurious?"},
+ ]
+
+ print("\n\n*** Prompt:")
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     tokenize=True,
+     return_tensors="pt",
+ )
+ print(tokenizer.decode(input_ids[0]))
+ ```
+
+ <details>
+
+ <summary>Output</summary>
+
+ ```python
+ """<s> [INST] Tell me a recipe for a mai tai. [/INST] 1 oz light rum\n½ oz dark rum\n (...) Add garnishes as desired.</s> [INST] How can I make it more upscale and luxurious? [/INST]"""
+ ```
+
+ </details>
+
+ ### Training Hyperparameters
+
+ We use the [SFTTrainer](https://huggingface.co/docs/trl/main/en/sft_trainer) from `trl` to fine-tune LLMs on instruction-following datasets.
+
+ See [here](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/mistral/sft_Mistral_7B_Instruct_v0_1_peft.ipynb) for the finetuning code, which contains an exhaustive view of the hyperparameters employed.
+
+ The following `TrainingArguments` config was used (a sketch of the equivalent constructor call follows the list):
+
+ - output_dir = "./results"
+ - num_train_epochs = 2
+ - auto_find_batch_size = True
+ - gradient_accumulation_steps = 2
+ - optim = "paged_adamw_32bit"
+ - save_strategy = "epoch"
+ - learning_rate = 3e-4
+ - lr_scheduler_type = "cosine"
+ - warmup_ratio = 0.03
+ - logging_strategy = "steps"
+ - logging_steps = 25
+ - evaluation_strategy = "no"
+ - bf16 = True
+
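+ Expressed as a constructor call (a sketch, assuming a `transformers` release where these argument names are valid):
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Sketch: the TrainingArguments listed above, written out in code.
+ training_args = TrainingArguments(
+     output_dir="./results",
+     num_train_epochs=2,
+     auto_find_batch_size=True,
+     gradient_accumulation_steps=2,
+     optim="paged_adamw_32bit",
+     save_strategy="epoch",
+     learning_rate=3e-4,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.03,
+     logging_strategy="steps",
+     logging_steps=25,
+     evaluation_strategy="no",
+     bf16=True,
+ )
+ ```
+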
+ The following `bitsandbytes` quantization config was used (a sketch of the equivalent `BitsAndBytesConfig` follows the list):
+
+ - quant_method: bitsandbytes
+ - load_in_8bit: False
+ - load_in_4bit: True
+ - llm_int8_threshold: 6.0
+ - llm_int8_skip_modules: None
+ - llm_int8_enable_fp32_cpu_offload: False
+ - llm_int8_has_fp16_weight: False
+ - bnb_4bit_quant_type: nf4
+ - bnb_4bit_use_double_quant: False
+ - bnb_4bit_compute_dtype: bfloat16
+
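+ Expressed as a `BitsAndBytesConfig` (a sketch; the int8 fields listed above are library defaults and are omitted):
+
+ ```python
+ import torch
+ from transformers import BitsAndBytesConfig
+
+ # Sketch: the 4-bit NF4 quantization config listed above.
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_use_double_quant=False,
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+ ```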
+
+ ## Model Card Contact
+
+ dryanfurman at gmail
+
+ ## Mistral Research Citation
+
+ ```
+ @misc{jiang2023mistral,
+     title={Mistral 7B},
+     author={Albert Q. Jiang and Alexandre Sablayrolles and Arthur Mensch and Chris Bamford and Devendra Singh Chaplot and Diego de las Casas and Florian Bressand and Gianna Lengyel and Guillaume Lample and Lucile Saulnier and Lélio Renard Lavaud and Marie-Anne Lachaux and Pierre Stock and Teven Le Scao and Thibaut Lavril and Thomas Wang and Timothée Lacroix and William El Sayed},
+     year={2023},
+     eprint={2310.06825},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ ## Framework versions
+
+ - PEFT 0.6.3.dev0
adapter_config.json ADDED
@@ -0,0 +1,25 @@
+ {
+     "alpha_pattern": {},
+     "auto_mapping": null,
+     "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+     "bias": "none",
+     "fan_in_fan_out": false,
+     "inference_mode": true,
+     "init_lora_weights": true,
+     "layers_pattern": null,
+     "layers_to_transform": null,
+     "lora_alpha": 16,
+     "lora_dropout": 0.1,
+     "modules_to_save": null,
+     "peft_type": "LORA",
+     "r": 64,
+     "rank_pattern": {},
+     "revision": null,
+     "target_modules": [
+         "q_proj",
+         "v_proj",
+         "o_proj",
+         "k_proj"
+     ],
+     "task_type": "CAUSAL_LM"
+ }
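
For reference, this adapter config corresponds to roughly the following `peft` `LoraConfig` (a sketch; the field names map one-to-one onto the JSON keys above):

```python
from peft import LoraConfig

# Sketch: LoRA adapter config matching adapter_config.json above.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    target_modules=["q_proj", "v_proj", "o_proj", "k_proj"],
    task_type="CAUSAL_LM",
)
```
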
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:94827fc381f566d4300bfc7261f61a3100960498a190d0680a2d6219ca039194
+ size 109086672
logo.png ADDED
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+     "bos_token": {
+         "content": "<s>",
+         "lstrip": false,
+         "normalized": false,
+         "rstrip": false,
+         "single_word": false
+     },
+     "eos_token": {
+         "content": "</s>",
+         "lstrip": false,
+         "normalized": false,
+         "rstrip": false,
+         "single_word": false
+     },
+     "pad_token": "<unk>",
+     "unk_token": {
+         "content": "<unk>",
+         "lstrip": false,
+         "normalized": false,
+         "rstrip": false,
+         "single_word": false
+     }
+ }
tokenizer.json ADDED
The diff for this file is too large to render.
 
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dadfd56d766715c61d2ef780a525ab43b8e6da4de6865bda3d95fdef5e134055
+ size 493443
tokenizer_config.json ADDED
@@ -0,0 +1,43 @@
+ {
+     "add_bos_token": false,
+     "add_eos_token": false,
+     "added_tokens_decoder": {
+         "0": {
+             "content": "<unk>",
+             "lstrip": false,
+             "normalized": false,
+             "rstrip": false,
+             "single_word": false,
+             "special": true
+         },
+         "1": {
+             "content": "<s>",
+             "lstrip": false,
+             "normalized": false,
+             "rstrip": false,
+             "single_word": false,
+             "special": true
+         },
+         "2": {
+             "content": "</s>",
+             "lstrip": false,
+             "normalized": false,
+             "rstrip": false,
+             "single_word": false,
+             "special": true
+         }
+     },
+     "additional_special_tokens": [],
+     "bos_token": "<s>",
+     "chat_template": "{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST] ' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token + ' ' }}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}",
+     "clean_up_tokenization_spaces": true,
+     "eos_token": "</s>",
+     "legacy": true,
+     "model_max_length": 1000000000000000019884624838656,
+     "pad_token": "<unk>",
+     "sp_model_kwargs": {},
+     "spaces_between_special_tokens": false,
+     "tokenizer_class": "LlamaTokenizer",
+     "unk_token": "<unk>",
+     "use_default_system_prompt": true
+ }