# Tether Multilabel Abuse Detection (v4)
This model is part of the Tether project — an AI-driven tool designed to identify emotional abuse patterns in text communication, including gaslighting, control, insults, projection, and more. It is built for use in survivor-facing tools, clinician review workflows, and law enforcement risk triage pilots.
## 🧠 Model Overview
- Architecture: RoBERTa-base + multi-label classification head
- Trained on: ~2,000 labeled abuse/non-abuse message examples
- Labels (12 total):
  - `blame shifting`
  - `contradictory statements`
  - `control`
  - `dismissiveness`
  - `gaslighting`
  - `guilt tripping`
  - `insults`
  - `obscure language`
  - `projection`
  - `recovery phase`
  - `nonabusive`
  - `is_from_me` (optional metadata, not always used in deployment)
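At inference time the authoritative index-to-label mapping ships in the model's config (`model.config.id2label`); the hard-coded list below is only an illustrative assumption that the indices follow the order above.

```python
# Assumed label order -- verify against model.config.id2label before relying
# on these indices in production.
LABELS = [
    "blame shifting", "contradictory statements", "control",
    "dismissiveness", "gaslighting", "guilt tripping", "insults",
    "obscure language", "projection", "recovery phase",
    "nonabusive", "is_from_me",
]
id2label = {i: name for i, name in enumerate(LABELS)}
label2id = {name: i for i, name in id2label.items()}
```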
## 🧪 Performance (Eval Set)

| Label | F1 Score |
|---|---|
| blame shifting | 0.84 |
| contradictory statements | 0.46 |
| control | 0.75 |
| dismissiveness | 0.68 |
| gaslighting | 0.56 |
| guilt tripping | 0.62 |
| insults | 0.71 |
| obscure language | 0.66 |
| projection | 0.81 |
| recovery phase | 0.55 |
| nonabusive | 0.54 |
| is_from_me | 0.84 |

**Macro F1:** 0.67
**Samples-average F1:** 0.66
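Macro F1 is the unweighted mean of the twelve per-label scores, so the headline number can be sanity-checked directly from the table:

```python
# Per-label F1 scores copied from the table above
f1 = {
    "blame shifting": 0.84, "contradictory statements": 0.46,
    "control": 0.75, "dismissiveness": 0.68, "gaslighting": 0.56,
    "guilt tripping": 0.62, "insults": 0.71, "obscure language": 0.66,
    "projection": 0.81, "recovery phase": 0.55, "nonabusive": 0.54,
    "is_from_me": 0.84,
}
macro_f1 = sum(f1.values()) / len(f1)
print(round(macro_f1, 2))  # → 0.67
```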
## 🛡️ Intended Use
This model supports:
- Real-time abuse detection in chat/text-based systems
- Therapist/case worker reflection tools
- Risk triage for domestic violence investigations
- Educational applications for identifying coercive or emotionally abusive behavior
This model is not a substitute for legal judgment or clinical diagnosis.
## ⚠️ Known Limitations

- May underperform on:
  - Highly poetic or metaphorical language
  - Sarcasm and irony
  - Non-English text
- The `gaslighting` and `recovery phase` labels show moderate performance and may require human review.
- The `is_from_me` label is metadata used for internal modeling and may be excluded from production use.
## 🧩 Technical Details

- Fine-tuned with `BCEWithLogitsLoss` plus a per-label `pos_weight`
- Per-label decision thresholds selected via a macro-F1 sweep over 0.1–0.9
- Temperature scaling applied for probability calibration
- Inference latency: ~XX ms/message (on GPU)
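The per-label threshold sweep described above can be sketched as follows. This is a minimal illustration, not the project's actual tuning script; the grid step of 0.05 and the tie-breaking rule (keep the first best threshold) are assumptions.

```python
import numpy as np

def sweep_thresholds(probs, y_true, grid=np.arange(0.1, 0.91, 0.05)):
    """For each label column, pick the threshold in `grid` that maximizes
    that label's F1 on a held-out set (sketch of the 0.1-0.9 sweep)."""
    thresholds = []
    for j in range(probs.shape[1]):
        best_t, best_f1 = 0.5, -1.0
        for t in grid:
            pred = probs[:, j] >= t
            tp = np.sum(pred & (y_true[:, j] == 1))
            fp = np.sum(pred & (y_true[:, j] == 0))
            fn = np.sum(~pred & (y_true[:, j] == 1))
            denom = 2 * tp + fp + fn
            f1 = 2 * tp / denom if denom else 0.0
            if f1 > best_f1:
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return np.array(thresholds)
```

In a multi-label setting a single global cutoff of 0.5 is rarely optimal, since label frequencies and calibration differ; tuning one threshold per label is what the macro-F1 sweep buys.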
## 💬 Example Inference

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("SamanthaStorm/tether-multilabel-v4")
tokenizer = AutoTokenizer.from_pretrained("SamanthaStorm/tether-multilabel-v4")

text = "You're making things up again — I never said that."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label head: apply a per-label sigmoid, not a softmax over labels
probs = torch.sigmoid(logits)
print(probs)
```
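To turn those sigmoid probabilities into label names, compare each one against its per-label threshold. The label order and the uniform 0.5 thresholds below are placeholders, not the tuned values from the sweep; check `model.config.id2label` for the real ordering.

```python
# Assumed label order and placeholder thresholds -- for illustration only.
LABELS = [
    "blame shifting", "contradictory statements", "control",
    "dismissiveness", "gaslighting", "guilt tripping", "insults",
    "obscure language", "projection", "recovery phase",
    "nonabusive", "is_from_me",
]
THRESHOLDS = [0.5] * len(LABELS)

def decode(probs, thresholds=THRESHOLDS, labels=LABELS):
    """Return the label names whose probability clears its threshold."""
    return [lab for lab, p, t in zip(labels, probs, thresholds) if p >= t]

# Usage with the snippet above: decode(probs[0].tolist())
```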