Commit 2e35e91 by syleetolow · verified · 1 parent: 2f667ec

Update README.md

Files changed (1)
  1. README.md +17 -17
README.md CHANGED
@@ -10,22 +10,22 @@ The model was trained on the residual stream in the 10th layer of instruction-tu

  The 1st to 17th dimensions of S3AE hidden features, respectively, correspond to activations of the following thoughts:

- 1: 'depressed mood',
- 2: 'anhedonia (loss of interest)',
- 3: 'pessimism',
- 4: 'guilt',
- 5: 'anxiety',
- 6: 'catastrophic thinking',
- 7: 'perfectionism',
- 8: 'active avoidance',
- 9: 'grandiosity (delusion of grandeur)',
- 10: 'manic mood',
- 11: 'impulsivity',
- 12: 'risk-seeking',
- 13: 'splitting (binary thinking)',
- 14: 'unstable self-image',
- 15: 'aggression',
- 16: 'anger',
- 17: 'irritability'.
+ 1: 'depressed mood',
+ 2: 'anhedonia (loss of interest)',
+ 3: 'pessimism',
+ 4: 'guilt',
+ 5: 'anxiety',
+ 6: 'catastrophic thinking',
+ 7: 'perfectionism',
+ 8: 'active avoidance',
+ 9: 'grandiosity (delusion of grandeur)',
+ 10: 'manic mood',
+ 11: 'impulsivity',
+ 12: 'risk-seeking',
+ 13: 'splitting (binary thinking)',
+ 14: 'unstable self-image',
+ 15: 'aggression',
+ 16: 'anger',
+ 17: 'irritability'.

  Dimensions 7, 13, and 14 were not included in the paper's analysis.
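
For readers consuming the README's dimension list programmatically, the mapping can be sketched as below. This is only an illustration: the names `S3AE_LABELS`, `EXCLUDED_DIMS`, and `label_features` are hypothetical, and it assumes the first 17 hidden features are already available as a plain sequence of numbers. How those features are extracted from the model is not covered here.

```python
# Labels copied from the README list, in order (README dimensions are 1-based).
S3AE_LABELS = [
    "depressed mood",
    "anhedonia (loss of interest)",
    "pessimism",
    "guilt",
    "anxiety",
    "catastrophic thinking",
    "perfectionism",
    "active avoidance",
    "grandiosity (delusion of grandeur)",
    "manic mood",
    "impulsivity",
    "risk-seeking",
    "splitting (binary thinking)",
    "unstable self-image",
    "aggression",
    "anger",
    "irritability",
]

# Dimensions the README says were not included in the paper's analysis.
EXCLUDED_DIMS = {7, 13, 14}


def label_features(hidden, skip_excluded=True):
    """Pair 17 activations with their README labels (hypothetical helper).

    `hidden` is assumed to be a length-17 sequence of numbers holding the
    1st to 17th S3AE hidden features, in that order.
    """
    if len(hidden) != len(S3AE_LABELS):
        raise ValueError(f"expected {len(S3AE_LABELS)} values, got {len(hidden)}")
    return {
        label: float(value)
        for dim, (label, value) in enumerate(zip(S3AE_LABELS, hidden), start=1)
        if not (skip_excluded and dim in EXCLUDED_DIMS)
    }
```

With the default `skip_excluded=True`, the result holds the 14 dimensions the paper analyzed; passing `skip_excluded=False` keeps all 17.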