Hugging Face
Scale Safety Research
AI & ML interests
None defined yet.
Team members: 8
scale-safety-research's activity
dpaleka authored a paper 2 days ago
Pitfalls in Evaluating Language Model Forecasters
Paper • 2506.00723 • Published 5 days ago • 3
abhayesian updated a collection 8 days ago
Alignment Faking Datasets
Collection • 11 items • Updated 8 days ago
abhayesian updated a collection about 2 months ago
Gemma 2 9b Emergent Misalignment
Collection • 6 items • Updated Apr 16
abhayesian updated a dataset 2 months ago
scale-safety-research/new_rlhf_not_purely_good_docs
Viewer • Updated Mar 27 • 13.6k • 23