WMDP Benchmark - a cais Collection

updated Mar 2

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning