---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
  - reasoning
  - reasoning-datasets-competition
datasets:
  - davanstrien/natural-reasoning-classifier
language:
  - en
metrics:
  - mse
  - mae
  - spearman
widget:
  - text: >-
      The debate on artificial intelligence's role in society has become
      increasingly polarized. Some argue that AI will lead to widespread
      unemployment and concentration of power, while others contend it will create
      new jobs and democratize access to knowledge. These viewpoints reflect
      different assumptions about technological development, economic systems, and
      human adaptability.
---

# ModernBERT Reasoning Complexity Regressor

<img src="https://cdn-uploads.huggingface.co/production/uploads/60107b385ac3e86b3ea4fc34/vqCMlr4g95ysSAZ2eAn7D.png" alt="ModernBERT-based Reasoning Complexity Regressor" width="500">

## Model Description

This model predicts the reasoning complexity level (0-4) of a given web text. It is fine-tuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [davanstrien/natural-reasoning-classifier](https://huggingface.co/datasets/davanstrien/natural-reasoning-classifier) dataset. The model is intended to be used as part of a pipeline for identifying texts that may be useful for generating reasoning data.

### Reasoning Complexity Scale

The reasoning complexity scale covers five levels:

- **0: Minimal Reasoning** - Simple factual content requiring only recall
- **1: Basic Reasoning** - Straightforward connections or single-step logical processes
- **2: Intermediate Reasoning** - Integration of multiple factors or perspectives
- **3: Advanced Reasoning** - Sophisticated analysis across multiple dimensions
- **4: Expert Reasoning** - Theoretical frameworks and novel conceptual synthesis

## Performance

The model achieves the following results on the evaluation set:

- MSE: 0.2034
- MAE: 0.2578
- Spearman Correlation: 0.6963
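
These metrics can be recomputed from model predictions and gold labels. A minimal NumPy-only sketch (the example arrays are hypothetical, not the actual evaluation set; the rank-based Spearman here assumes no tied values):

```python
import numpy as np

def regression_metrics(preds, labels):
    """Compute MSE, MAE and Spearman correlation for regression outputs.

    Spearman is computed as the Pearson correlation of ranks; this
    simple double-argsort ranking assumes no tied values.
    """
    p = np.asarray(preds, dtype=float)
    y = np.asarray(labels, dtype=float)
    mse = float(np.mean((p - y) ** 2))
    mae = float(np.mean(np.abs(p - y)))
    rank = lambda a: np.argsort(np.argsort(a))
    spearman = float(np.corrcoef(rank(p), rank(y))[0, 1])
    return {"mse": mse, "mae": mae, "spearman": spearman}

# Hypothetical predictions vs. labels, for illustration only
print(regression_metrics([0.2, 1.1, 2.9, 3.8], [0, 1, 3, 4]))
```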

## Intended Uses

This model can be used to:

- Filter and classify educational content by reasoning complexity
- Identify complex reasoning problems across diverse domains
- Serve as a first-stage filter in a reasoning dataset creation pipeline
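
As an illustration of the filtering use case, a sketch of a threshold filter: in practice the scoring function would be the model's regression output (see Usage Examples), and the threshold of 2.5 is an arbitrary assumption to tune for your pipeline:

```python
def filter_by_complexity(texts, score_fn, threshold=2.5):
    """Keep texts whose predicted reasoning complexity meets the threshold.

    score_fn maps a text to a float score (e.g. the model's regression
    output); threshold=2.5 is an arbitrary cut-off chosen for illustration.
    """
    return [t for t in texts if score_fn(t) >= threshold]

# Stub scorer for illustration; in practice, call the model instead
demo_scores = {
    "Water boils at 100 degrees Celsius.": 0.3,
    "Compare utilitarian and deontological ethics.": 3.2,
}
print(filter_by_complexity(list(demo_scores), demo_scores.get))
```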

## Limitations

- Predictions are influenced by the original dataset's domain distribution
- Reasoning complexity is subjective and context-dependent

## Training

The model was fine-tuned using a regression objective with the following settings:

- Learning rate: 5e-05
- Batch size: 16
- Optimizer: AdamW
- Schedule: Linear
- Epochs: 10
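
Under the standard Hugging Face `Trainer` API, a setup matching these hyperparameters would look roughly like this (a sketch, not the actual training script; dataset preparation and the output directory name are omitted or assumed):

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# A regression head: one output unit trained with MSE loss
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-base",
    num_labels=1,
    problem_type="regression",
)
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")

args = TrainingArguments(
    output_dir="modernbert-reasoning",  # assumed name
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",
)

# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)
# trainer.train()
```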

## Usage Examples

### Using the pipeline API

```python
from transformers import pipeline
pipe = pipeline("text-classification", model="davanstrien/ModernBERT-based-Reasoning-Required")

def predict_reasoning_level(text, pipe):
    # Get the raw regression score; function_to_apply="none" disables the
    # pipeline's default sigmoid, which would squash the 0-4 scale into 0-1
    result = pipe(text, function_to_apply="none")
    score = result[0]['score']

    # Round to nearest integer (optional)
    rounded_score = round(score)

    # Clip to valid range (0-4)
    rounded_score = max(0, min(4, rounded_score))

    # Create a human-readable interpretation (optional)
    reasoning_labels = {
        0: "Minimal reasoning",
        1: "Basic reasoning",
        2: "Intermediate reasoning",
        3: "Advanced reasoning",
        4: "Expert reasoning"
    }

    return {
        "raw_score": score,
        "reasoning_level": rounded_score,
        "interpretation": reasoning_labels[rounded_score]
    }

# Usage
text = "This argument uses multiple sources and evaluates competing perspectives before reaching a conclusion."
result = predict_reasoning_level(text, pipe)
print(f"Raw score: {result['raw_score']:.2f}")
print(f"Reasoning level: {result['reasoning_level']}")
print(f"Interpretation: {result['interpretation']}")
```

### Using the model directly

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "davanstrien/ModernBERT-based-Reasoning-Required"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text
text = "The debate on artificial intelligence's role in society has become increasingly polarized."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

# Get regression score
complexity_score = outputs.logits.item()
print(f"Reasoning Complexity: {complexity_score:.2f}/4.00")
```