greatakela committed
Commit e246c89 · verified · 1 Parent(s): 4243295

Add new SentenceTransformer model

README.md CHANGED
@@ -6,65 +6,67 @@ tags:
  - generated_from_trainer
  - dataset_size:4893
  - loss:TripletLoss
- base_model: microsoft/deberta-base
  widget:
- - source_sentence: Perfect working condition. Then what you say leads obviously to
- one alternative. The source of radiation is not from our universe. Nor in our
- universe, Captain. It came from outside. Outside? Yes, that would explain a lot.
- Another universe, perhaps in another dimension, occupying the same space at the
- same time. The possible existence of a parallel universe has been scientifically
- conceded, Captain.[SEP]All right. What would happen if another universe, say a
- minus universe, came into contact with a positive universe such as ours?
  sentences:
- - ' How''s your leg? You seem to be favoring your left side.'
- - Unquestionably a warp. A distortion of physical laws on an immense scale.
- - Queen to queen's level three.
- - source_sentence: The transporter refuses to function, even at maximum power. But
- all the circuits test out. It appears to be the same energy block that's jamming
- our communications. I cannot pinpoint a source. Captain, there's something over
- there in the trees. Metal alloy like the planetary shell. It might tell us something.
- There's an inscription, several languages.[SEP]The Keeper's dead.
  sentences:
- - ' How much heat are you taking from the parents?'
- - This vault was constructed about a half a million years ago. About the same time
- the planet surface was destroyed, if our sensor readings are accurate.
- - An astute medical observation, Doctor, if we can believe this information. Tricorder
- readings indicate there is a body interred here.
- - source_sentence: Welcome home, Jim. I had a whole universe to myself after the Defiant
- was thrown out. There was absolutely no one else in it. I must say I prefer a
- crowded universe much better. How did you two get along without me? Oh, we managed.
- Mister Spock gave the orders, and I found the answers. Good. No problems between
- you? None worth reporting, Captain.[SEP]Try me.
  sentences:
- - Only such minor disturbances as are inevitable when humans are involved.
- - ' Harder than the right?'
- - Good. Report to Sickbay, Mister Sulu.
- - source_sentence: Too bad, Captain. Maybe I can't go home, but neither can you. You're
- as much a prisoner in time as I am. Recommendation for his disposition, dear?
- Maintenance note. My recording computer has a serious malfunction. Recommend it
- either be corrected or scrapped. Compute. Computed. Bridge to Captain Kirk.[SEP]Kirk
- here.
  sentences:
- - Have some new information regarding Captain Christopher. Important I see you both
- immediately.
- - Several times, Captain. I do not wish to surrender hope, but the facts remain
- unchangeable.
- - ' [almost imitating an orgasm] Ohhh, yes! Get a head CT, draw a blood culture,
- run a chem panel and get a complete blood count.'
- - source_sentence: That's paradise? We have no need or want, Captain. It's a true
- Eden, Jim. There's belonging and love. No wants. No needs. We weren't meant for
- that. None of us. Man stagnates if he has no ambition, no desire to be more than
- he is. We have what we need.[SEP]Except a challenge.
  sentences:
- - Sir?
- - ' Happy Valentine''s Day.'
- - You don't understand, Jim, but you'll come around sooner or later. Join us. Please.
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  metrics:
  - cosine_accuracy
  model-index:
- - name: SentenceTransformer based on microsoft/deberta-base
  results:
  - task:
  type: triplet
@@ -74,7 +76,7 @@ model-index:
  type: evaluator_enc
  metrics:
  - type: cosine_accuracy
- value: 0.9991825222969055
  name: Cosine Accuracy
  - task:
  type: triplet
@@ -84,19 +86,19 @@ model-index:
  type: evaluator_val
  metrics:
  - type: cosine_accuracy
- value: 0.9814814925193787
  name: Cosine Accuracy
  ---

- # SentenceTransformer based on microsoft/deberta-base

- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/deberta-base](https://huggingface.co/microsoft/deberta-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
- - **Base model:** [microsoft/deberta-base](https://huggingface.co/microsoft/deberta-base) <!-- at revision 0d1b43ccf21b5acd9f4e5f7b077fa698f05cf195 -->
  - **Maximum Sequence Length:** 128 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
@@ -114,7 +116,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [m

  ```
  SentenceTransformer(
- (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DebertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
  ```
@@ -137,9 +139,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("greatakela/gnlp_hw1_encoder")
  # Run inference
  sentences = [
- "That's paradise? We have no need or want, Captain. It's a true Eden, Jim. There's belonging and love. No wants. No needs. We weren't meant for that. None of us. Man stagnates if he has no ambition, no desire to be more than he is. We have what we need.[SEP]Except a challenge.",
- "You don't understand, Jim, but you'll come around sooner or later. Join us. Please.",
- " Happy Valentine's Day.",
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -186,7 +188,7 @@ You can finetune this model on your own dataset.

  | Metric | evaluator_enc | evaluator_val |
  |:--------------------|:--------------|:--------------|
- | **cosine_accuracy** | **0.9992** | **0.9815** |

  <!--
  ## Bias, Risks and Limitations
@@ -212,13 +214,13 @@ You can finetune this model on your own dataset.
  | | sentence_0 | sentence_1 | sentence_2 |
  |:--------|:-----------|:-----------|:-----------|
  | type | string | string | string |
- | details | <ul><li>min: 2 tokens</li><li>mean: 83.32 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 18.63 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 18.98 tokens</li><li>max: 128 tokens</li></ul> |
  * Samples:
- | sentence_0 | sentence_1 | sentence_2 |
- |:-----------|:-----------|:-----------|
- | <code>Don't know yet. Engineering. No casualties, Captain, but trouble aplenty with the engines. Every dilithium crystal connection's smashed in the warp engine circuitry. We're trying to bypass them now. What about main circuits? Well, you have to see it to believe it, sir. Those big crystals in there have come apart. Each of them unpeeling like the rind of an orange. Analysis, Spock.[SEP]Our only hope now is rewiring impulse. But there are a thousand broken connections.</code> | <code>Captain, this is quite unprecedented. Notice the fracturing is spiro-form, similar to long chain molecules.</code> | <code> No signs of drug use or acetaminophen poisoning in his tox screen. Maybe the water was contaminated.</code> |
- | <code>Behold. That is most significant. An instinct new to the essence of her being is generating. Compassion for another is becoming part of her functioning life system. She is afraid. She's saving herself. She does not yet have the instinct to save her people. We have failed?[SEP]No. No, not yet.</code> | <code>Captain, Dr. McCoy's life is not solely dependent on Gem. The Vians too must be capable of saving his life.</code> | <code> Not right now. She's already on a respirator. The maParkne is breathing for her. I can do whatEver I want to her lungs. If you're playing catch in the living room and you break your mother's vase you might as well keep playing catch. The vase is already broken.</code> |
- | <code>He was aware of what might happen when he went. I should never have let him go. You had no choice, Captain. You could not have stopped him. How can you ignore that? A Vulcan would not cry out so.[SEP]Whether he's a Vulcan or not, he's in agony.</code> | <code>I am not insensitive to it, Captain.</code> | <code> What about something vascular, polyarteritis nodosa.</code> |
  * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
  ```json
  {
@@ -357,25 +359,25 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | Training Loss | evaluator_enc_cosine_accuracy | evaluator_val_cosine_accuracy |
  |:------:|:----:|:-------------:|:-----------------------------:|:-----------------------------:|
- | -1 | -1 | - | 0.6203 | - |
- | 0.4902 | 300 | - | 0.9789 | - |
- | 0.8170 | 500 | 0.8516 | - | - |
- | 0.9804 | 600 | - | 0.9931 | - |
- | 1.0 | 612 | - | 0.9937 | - |
- | 1.4706 | 900 | - | 0.9955 | - |
- | 1.6340 | 1000 | 0.1586 | - | - |
- | 1.9608 | 1200 | - | 0.9982 | - |
- | 2.0 | 1224 | - | 0.9992 | - |
- | 2.4510 | 1500 | 0.0644 | 0.9992 | - |
- | 2.9412 | 1800 | - | 0.9992 | - |
- | 3.0 | 1836 | - | 0.9992 | - |
- | -1 | -1 | - | - | 0.9815 |


  ### Framework Versions
  - Python: 3.11.11
  - Sentence Transformers: 3.4.1
- - Transformers: 4.48.3
  - PyTorch: 2.5.1+cu124
  - Accelerate: 1.3.0
  - Datasets: 3.3.2
 
  - generated_from_trainer
  - dataset_size:4893
  - loss:TripletLoss
+ base_model: distilbert/distilroberta-base
  widget:
+ - source_sentence: Indeed. Allow me to rephrase. Will you join me for dinner? I am
+ honoured, Commander. Are the guards also invited? Mister Spock. That corridor
+ is forbidden to all but loyal Romulans. Of course. I shall obey your restrictions.[SEP]I
+ hope that one day there will be no need for you to observe any restrictions.
  sentences:
+ - ' EKG showed arrhythmia, pRobably just a mild heart attack.'
+ - It would be illogical to assume that all conditions remain stable.
+ - Your very presence will destroy the people you seek. Surely you know that.
+ - source_sentence: Mudd. And he has Christine. She's in danger. My love. He's going
+ planet side. No. Not with my Christine. Relax, darling. I'll set you down somewhere
+ safe and then I'll be off discreetly. We must go after them, Captain. I'll lead
+ a landing party.[SEP]Spock, you're obviously not yourself. Maybe some rest.
  sentences:
+ - ' If it''s meningococcus, half the passengers on this plane could get infected
+ and die before we reach New York.'
+ - Tactically well planned. When the Federation investigates, we'll be recorded as
+ just another mysterious starship disappearance.
+ - Captain, I insist upon going. Christine. I can't stand the thought of any danger
+ to her, to the woman I love.
+ - source_sentence: That is precisely why we should not fight. My ship is at stake.
+ I will not harm others, Captain. His convictions are most profound in this matter,
+ Captain. So are mine, Spock. If I believed that there was a peaceful way out of
+ this[SEP]The risk will be mine alone. If I fail, you lose nothing. After all,
+ I'm no warrior.
  sentences:
+ - The captain knows that I have fought at his side before and will do so now, if
+ need be. However, I too, am a Vulcan, bred to peace. Let him attempt it.
+ - ' A torch test could "�'
+ - I have retained more strength than any of you. My internal structure is different,
+ Captain, my life span longer. It is wiser if I go to the temple to try and find
+ the communicators and contact the ship.
+ - source_sentence: So now it has virtually unlimited power. Captain, what'll we do?
+ Spock, Scotty, come with me. Report, Spock. The multitronic unit is drawing more
+ and more power directly from the warp engines. The computer now controls all helm,
+ navigation, and engineering functions. And communications and fire control.[SEP]We'll
+ reach the rendezvous point for the war games within an hour. We must regain control
+ of the ship by then.
  sentences:
+ - There is one possibility. The automatic helm navigation circuit relays might be
+ disrupted from engineering level three.
+ - Nothing there.
+ - ' Wow, you remember where our first date was? I didn''t think you were paying
+ attention.'
+ - source_sentence: I want facts, not poetry. I have given you the facts, Captain.
+ The entire magnetic field in this solar system simply blinked. The planet below,
+ the mass of which we're measuring, attained zero gravity. That's impossible. What
+ you're describing Is non-existence. Standard General Alert signal from Starfleet
+ Command, Captain.[SEP]All stations to immediate alert status. Stand by.
  sentences:
+ - As you may recall from your histories, this conflict was fought,
+ - ' Mm hmm. [Quick wink to the parents.] Okay, lean forwards. Now hold very still,
+ okay? [He picks at Clancy''s neck with some tweezers.] Got it!'
+ - Captain, scanners now report a life object on the planet surface below.
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  metrics:
  - cosine_accuracy
  model-index:
+ - name: SentenceTransformer based on distilbert/distilroberta-base
  results:
  - task:
  type: triplet
 
  type: evaluator_enc
  metrics:
  - type: cosine_accuracy
+ value: 0.9995912313461304
  name: Cosine Accuracy
  - task:
  type: triplet
 
  type: evaluator_val
  metrics:
  - type: cosine_accuracy
+ value: 0.9861111044883728
  name: Cosine Accuracy
  ---

+ # SentenceTransformer based on distilbert/distilroberta-base

+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [distilbert/distilroberta-base](https://huggingface.co/distilbert/distilroberta-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  ### Model Description
  - **Model Type:** Sentence Transformer
+ - **Base model:** [distilbert/distilroberta-base](https://huggingface.co/distilbert/distilroberta-base) <!-- at revision fb53ab8802853c8e4fbdbcd0529f21fc6f459b2b -->
  - **Maximum Sequence Length:** 128 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
 

  ```
  SentenceTransformer(
+ (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: RobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  )
  ```
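A note on the `Pooling` module above: with `pooling_mode_mean_tokens: True`, the sentence embedding is the attention-mask-aware mean of the token embeddings produced by `RobertaModel`. A minimal sketch of that operation (an illustration of the technique, not the library's exact code):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Masked mean pooling: average token embeddings, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(dim=1)                   # sum over real tokens
    counts = mask.sum(dim=1).clamp(min=1e-9)                        # tokens per sample
    return summed / counts                                          # (batch, 768)
```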
 
  model = SentenceTransformer("greatakela/gnlp_hw1_encoder")
  # Run inference
  sentences = [
+ "I want facts, not poetry. I have given you the facts, Captain. The entire magnetic field in this solar system simply blinked. The planet below, the mass of which we're measuring, attained zero gravity. That's impossible. What you're describing Is non-existence. Standard General Alert signal from Starfleet Command, Captain.[SEP]All stations to immediate alert status. Stand by.",
+ 'Captain, scanners now report a life object on the planet surface below.',
+ " Mm hmm. [Quick wink to the parents.] Okay, lean forwards. Now hold very still, okay? [He picks at Clancy's neck with some tweezers.] Got it!",
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
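The embeddings can also be compared directly. `SentenceTransformer.similarity` (available in the Sentence Transformers 3.x that this card pins) applies the model's declared similarity function, cosine:

```python
# Reusing `model` and `embeddings` from the snippet above: pairwise
# cosine similarities between the three encoded sentences.
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# (3, 3)
```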
 

  | Metric | evaluator_enc | evaluator_val |
  |:--------------------|:--------------|:--------------|
+ | **cosine_accuracy** | **0.9996** | **0.9861** |
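For reference, `cosine_accuracy` in this triplet task is the fraction of (anchor, positive, negative) triplets in which the anchor is more cosine-similar to the positive than to the negative. A rough sketch, assuming pre-computed, L2-normalized embeddings (the library's `TripletEvaluator` handles this internally):

```python
import numpy as np

def cosine_accuracy(anchors: np.ndarray, positives: np.ndarray, negatives: np.ndarray) -> float:
    # Inputs: (n, dim) arrays assumed L2-normalized, so the row-wise
    # dot product equals the cosine similarity.
    sim_pos = (anchors * positives).sum(axis=1)
    sim_neg = (anchors * negatives).sum(axis=1)
    return float((sim_pos > sim_neg).mean())
```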

  <!--
  ## Bias, Risks and Limitations
 
  | | sentence_0 | sentence_1 | sentence_2 |
  |:--------|:-----------|:-----------|:-----------|
  | type | string | string | string |
+ | details | <ul><li>min: 2 tokens</li><li>mean: 83.72 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 19.05 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 19.47 tokens</li><li>max: 128 tokens</li></ul> |
  * Samples:
+ | sentence_0 | sentence_1 | sentence_2 |
+ |:-----------|:-----------|:-----------|
+ | <code>I'm not a plebe. This is today, fifteen years later. What are you doing here? I'm being exactly what you expect me to be, Jimmy boy. Did you enjoy it, Captain? Yes, I enjoyed it. After all these years. I did enjoy it. The one thing I wanted to do after all these years was to beat the tar out of Finnegan. Which supports a theory I've been formulating.[SEP]That we're all meeting people and things that we happen to be thinking about at the moment.</code> | <code>Yes. Somehow our thoughts are read, these things are quickly manufactured and provided for us.</code> | <code> You did not suddenly fall in love with me. You were looking for something, and I happened to be st "�</code> |
+ | <code>McCoy here. Received and understood. But we still have some doubts up here, Captain. Can you tell us any more? Not really. When do you plan to beam back up, Captain? I think we'll spend the night here, Mister Spock.[SEP]No! No, no, no.</code> | <code>And you will continue to check in every four hours?</code> | <code> Is Everything ok?</code> |
+ | <code>Do you think it would cause a complete breakdown of discipline if a lowly lieutenant kissed a Starship Captain on the bridge of his ship? Let's try. See? No change. Discipline goes on. And so must the Enterprise. Goodbye, Jim. Goodbye, Areel. Better luck next time. I had pretty good luck this time. I lost, didn't l?[SEP]She's a very good lawyer.</code> | <code>Obviously.</code> | <code> [over PA system, somberly] Ladies and gentlemen, we have a passenger with a confirmed case of bacterial meningitis.</code> |
  * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
  ```json
  {
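The parameter block above is truncated in this view. For orientation only: `TripletLoss` trains the encoder so that the anchor-positive distance beats the anchor-negative distance by a margin. The sketch below uses the library's documented defaults (Euclidean distance, margin 5), not values recovered from this diff:

```python
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin: float = 5.0):
    # margin=5.0 is the sentence-transformers default, assumed here;
    # the parameters actually used for this model are truncated above.
    d_pos = F.pairwise_distance(anchor, positive, p=2)  # anchor-positive distance
    d_neg = F.pairwise_distance(anchor, negative, p=2)  # anchor-negative distance
    return F.relu(d_pos - d_neg + margin).mean()        # hinge, averaged over batch
```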
 
  ### Training Logs
  | Epoch | Step | Training Loss | evaluator_enc_cosine_accuracy | evaluator_val_cosine_accuracy |
  |:------:|:----:|:-------------:|:-----------------------------:|:-----------------------------:|
+ | -1 | -1 | - | 0.5931 | - |
+ | 0.4902 | 300 | - | 0.9832 | - |
+ | 0.8170 | 500 | 1.0694 | - | - |
+ | 0.9804 | 600 | - | 0.9926 | - |
+ | 1.0 | 612 | - | 0.9939 | - |
+ | 1.4706 | 900 | - | 0.9965 | - |
+ | 1.6340 | 1000 | 0.1834 | - | - |
+ | 1.9608 | 1200 | - | 0.9988 | - |
+ | 2.0 | 1224 | - | 0.9988 | - |
+ | 2.4510 | 1500 | 0.0539 | 0.9992 | - |
+ | 2.9412 | 1800 | - | 0.9996 | - |
+ | 3.0 | 1836 | - | 0.9996 | - |
+ | -1 | -1 | - | - | 0.9861 |


  ### Framework Versions
  - Python: 3.11.11
  - Sentence Transformers: 3.4.1
+ - Transformers: 4.49.0
  - PyTorch: 2.5.1+cu124
  - Accelerate: 1.3.0
  - Datasets: 3.3.2
config.json CHANGED
@@ -1,33 +1,27 @@
  {
- "_name_or_path": "microsoft/deberta-base",
  "architectures": [
- "DebertaModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
- "layer_norm_eps": 1e-07,
- "legacy": true,
- "max_position_embeddings": 512,
- "max_relative_positions": -1,
- "model_type": "deberta",
  "num_attention_heads": 12,
- "num_hidden_layers": 12,
- "pad_token_id": 0,
- "pooler_dropout": 0,
- "pooler_hidden_act": "gelu",
- "pooler_hidden_size": 768,
- "pos_att_type": [
- "c2p",
- "p2c"
- ],
- "position_biased_input": false,
- "relative_attention": true,
  "torch_dtype": "float32",
- "transformers_version": "4.48.3",
- "type_vocab_size": 0,
  "vocab_size": 50265
  }
 
  {
+ "_name_or_path": "distilroberta-base",
  "architectures": [
+ "RobertaModel"
  ],
  "attention_probs_dropout_prob": 0.1,
+ "bos_token_id": 0,
+ "classifier_dropout": null,
+ "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
+ "layer_norm_eps": 1e-05,
+ "max_position_embeddings": 514,
+ "model_type": "roberta",
  "num_attention_heads": 12,
+ "num_hidden_layers": 6,
+ "pad_token_id": 1,
+ "position_embedding_type": "absolute",
  "torch_dtype": "float32",
+ "transformers_version": "4.49.0",
+ "type_vocab_size": 1,
+ "use_cache": true,
  "vocab_size": 50265
  }
config_sentence_transformers.json CHANGED
@@ -1,7 +1,7 @@
  {
  "__version__": {
  "sentence_transformers": "3.4.1",
- "transformers": "4.48.3",
  "pytorch": "2.5.1+cu124"
  },
  "prompts": {},
 
  {
  "__version__": {
  "sentence_transformers": "3.4.1",
+ "transformers": "4.49.0",
  "pytorch": "2.5.1+cu124"
  },
  "prompts": {},
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8183b3fd54c073e9a380efb364c4500297686d41cb95e8c2146e9caabd2f3385
- size 554429144
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:f0502135f9f964ae538e564593524e73ba2fbe10f4e311f1ba3be445c87d2844
+ size 328485128
special_tokens_map.json CHANGED
@@ -1,51 +1,15 @@
  {
- "bos_token": {
- "content": "[CLS]",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
- "cls_token": {
- "content": "[CLS]",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
- "eos_token": {
- "content": "[SEP]",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
  "mask_token": {
- "content": "[MASK]",
  "lstrip": true,
- "normalized": true,
- "rstrip": false,
- "single_word": false
- },
- "pad_token": {
- "content": "[PAD]",
- "lstrip": false,
  "normalized": false,
  "rstrip": false,
  "single_word": false
  },
- "sep_token": {
- "content": "[SEP]",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
- "unk_token": {
- "content": "[UNK]",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- }
  }
 
  {
+ "bos_token": "<s>",
+ "cls_token": "<s>",
+ "eos_token": "</s>",
  "mask_token": {
+ "content": "<mask>",
  "lstrip": true,
  "normalized": false,
  "rstrip": false,
  "single_word": false
  },
+ "pad_token": "<pad>",
+ "sep_token": "</s>",
+ "unk_token": "<unk>"
  }
tokenizer.json CHANGED
@@ -10,54 +10,54 @@
  "strategy": "BatchLongest",
  "direction": "Right",
  "pad_to_multiple_of": null,
- "pad_id": 0,
  "pad_type_id": 0,
- "pad_token": "[PAD]"
  },
  "added_tokens": [
  {
  "id": 0,
- "content": "[PAD]",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
- "normalized": false,
  "special": true
  },
  {
  "id": 1,
- "content": "[CLS]",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
- "normalized": false,
  "special": true
  },
  {
  "id": 2,
- "content": "[SEP]",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
- "normalized": false,
  "special": true
  },
  {
  "id": 3,
- "content": "[UNK]",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
- "normalized": false,
  "special": true
  },
  {
  "id": 50264,
- "content": "[MASK]",
  "single_word": false,
  "lstrip": true,
  "rstrip": false,
- "normalized": true,
  "special": true
  }
  ],
@@ -69,79 +69,17 @@
  "use_regex": true
  },
  "post_processor": {
- "type": "TemplateProcessing",
- "single": [
- {
- "SpecialToken": {
- "id": "[CLS]",
- "type_id": 0
- }
- },
- {
- "Sequence": {
- "id": "A",
- "type_id": 0
- }
- },
- {
- "SpecialToken": {
- "id": "[SEP]",
- "type_id": 0
- }
- }
  ],
- "pair": [
- {
- "SpecialToken": {
- "id": "[CLS]",
- "type_id": 0
- }
- },
- {
- "Sequence": {
- "id": "A",
- "type_id": 0
- }
- },
- {
- "SpecialToken": {
- "id": "[SEP]",
- "type_id": 0
- }
- },
- {
- "Sequence": {
- "id": "B",
- "type_id": 1
- }
- },
- {
- "SpecialToken": {
- "id": "[SEP]",
- "type_id": 1
- }
- }
  ],
- "special_tokens": {
- "[CLS]": {
- "id": "[CLS]",
- "ids": [
- 1
- ],
- "tokens": [
- "[CLS]"
- ]
- },
- "[SEP]": {
- "id": "[SEP]",
- "ids": [
- 2
- ],
- "tokens": [
- "[SEP]"
- ]
- }
- }
  },
  "decoder": {
  "type": "ByteLevel",
@@ -159,10 +97,10 @@
  "byte_fallback": false,
  "ignore_merges": false,
  "vocab": {
- "[PAD]": 0,
- "[CLS]": 1,
- "[SEP]": 2,
- "[UNK]": 3,
  ".": 4,
  "Ġthe": 5,
  ",": 6,
@@ -50423,7 +50361,7 @@
  "madeupword0000": 50261,
  "madeupword0001": 50262,
  "madeupword0002": 50263,
- "[MASK]": 50264
  },
  "merges": [
  [
 
  "strategy": "BatchLongest",
  "direction": "Right",
  "pad_to_multiple_of": null,
+ "pad_id": 1,
  "pad_type_id": 0,
+ "pad_token": "<pad>"
  },
  "added_tokens": [
  {
  "id": 0,
+ "content": "<s>",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
+ "normalized": true,
  "special": true
  },
  {
  "id": 1,
+ "content": "<pad>",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
+ "normalized": true,
  "special": true
  },
  {
  "id": 2,
+ "content": "</s>",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
+ "normalized": true,
  "special": true
  },
  {
  "id": 3,
+ "content": "<unk>",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
+ "normalized": true,
  "special": true
  },
  {
  "id": 50264,
+ "content": "<mask>",
  "single_word": false,
  "lstrip": true,
  "rstrip": false,
+ "normalized": false,
  "special": true
  }
  ],
 
  "use_regex": true
  },
  "post_processor": {
+ "type": "RobertaProcessing",
+ "sep": [
+ "</s>",
+ 2
  ],
+ "cls": [
+ "<s>",
+ 0
  ],
+ "trim_offsets": true,
+ "add_prefix_space": false
  },
  "decoder": {
  "type": "ByteLevel",
 
  "byte_fallback": false,
  "ignore_merges": false,
  "vocab": {
+ "<s>": 0,
+ "<pad>": 1,
+ "</s>": 2,
+ "<unk>": 3,
  ".": 4,
  "Ġthe": 5,
  ",": 6,
 
  "madeupword0000": 50261,
  "madeupword0001": 50262,
  "madeupword0002": 50263,
+ "<mask>": 50264
  },
  "merges": [
  [
tokenizer_config.json CHANGED
@@ -1,60 +1,58 @@
  {
- "add_bos_token": false,
  "add_prefix_space": false,
  "added_tokens_decoder": {
  "0": {
- "content": "[PAD]",
  "lstrip": false,
- "normalized": false,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
  "1": {
- "content": "[CLS]",
  "lstrip": false,
- "normalized": false,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
  "2": {
- "content": "[SEP]",
  "lstrip": false,
- "normalized": false,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
  "3": {
- "content": "[UNK]",
  "lstrip": false,
- "normalized": false,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
  "50264": {
- "content": "[MASK]",
  "lstrip": true,
- "normalized": true,
  "rstrip": false,
  "single_word": false,
  "special": true
  }
  },
- "bos_token": "[CLS]",
  "clean_up_tokenization_spaces": false,
- "cls_token": "[CLS]",
- "do_lower_case": false,
- "eos_token": "[SEP]",
  "errors": "replace",
  "extra_special_tokens": {},
- "mask_token": "[MASK]",
  "model_max_length": 128,
- "pad_token": "[PAD]",
- "sep_token": "[SEP]",
- "tokenizer_class": "DebertaTokenizer",
- "unk_token": "[UNK]",
- "vocab_type": "gpt2"
  }
 
  {
  "add_prefix_space": false,
  "added_tokens_decoder": {
  "0": {
+ "content": "<s>",
  "lstrip": false,
+ "normalized": true,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
  "1": {
+ "content": "<pad>",
  "lstrip": false,
+ "normalized": true,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
  "2": {
+ "content": "</s>",
  "lstrip": false,
+ "normalized": true,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
  "3": {
+ "content": "<unk>",
  "lstrip": false,
+ "normalized": true,
  "rstrip": false,
  "single_word": false,
  "special": true
  },
  "50264": {
+ "content": "<mask>",
  "lstrip": true,
+ "normalized": false,
  "rstrip": false,
  "single_word": false,
  "special": true
  }
  },
+ "bos_token": "<s>",
  "clean_up_tokenization_spaces": false,
+ "cls_token": "<s>",
+ "eos_token": "</s>",
  "errors": "replace",
  "extra_special_tokens": {},
+ "mask_token": "<mask>",
  "model_max_length": 128,
+ "pad_token": "<pad>",
+ "sep_token": "</s>",
+ "tokenizer_class": "RobertaTokenizer",
+ "trim_offsets": true,
+ "unk_token": "<unk>"
  }
vocab.json CHANGED
The diff for this file is too large to render. See raw diff