The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
conference contribution
posted on 2024-11-20, 03:22 authored by N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti, JD Li, AK Dombrowski, S Goel, G Mukobi, N Helm-Burger, R Lababidi, L Justen, AB Liu, M Chen, I Barrass, O Zhang, X Zhu, R Tamirisa, B Bharathi, A Herbert-Voss, CB Breuer, A Zou, M Mazeika, Z Wang, P Oswal, W Lin, AA Hunt, J Tienken-Harder, KY Shih, K Talley, J Guan, I Steneker, D Campbell, B Jokubaitis, S Basart, S Fitz, P Kumaraguru, KK Karmakar, U Tupakula, V Varadharajan, Y Shoshitaishvili, J Ba, KM Esvelt, A Wang, D Hendrycks
Volume
235
Pagination
1-26
Location
Vienna, Austria
Open access
No
Start date
2024-07-21
End date
2024-07-27
eISSN
2640-3498
Language
eng
Publication classification
E1.1 Full written paper - refereed
Title of proceedings
PMLR 2024 : Proceedings of the 41st International Conference on Machine Learning