rin2401
AI & ML interests
None yet
Recent Activity
updated a collection 4 days ago: Safety
updated a collection 8 days ago: Safety
updated a collection 8 days ago: Safety
Organizations
Safety
- yiting/UnsafeBench
  Viewer • Updated • 10.1k • 541 • 20
- Zonghao2025/safebench
  Preview • Updated • 442 • 5
- A Holistic Approach to Undesired Content Detection in the Real World
  Paper • 2208.03274 • Published
- PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
  Paper • 2504.04377 • Published • 1
Think
Agent
Tokenizer
DPO
Fewshot
Aya
- Multilingual Arbitrage: Optimizing Data Pools to Accelerate Multilingual Progress
  Paper • 2408.14960 • Published • 1
- RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
  Paper • 2407.02552 • Published • 5
- The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm
  Paper • 2406.18682 • Published • 1
- Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning
  Paper • 2410.10801 • Published • 4
Benchmark
PEFT
LLM