Xiaobo Wang

Yofuria

https://yofuria.github.io/

AI & ML interests

Reward Modeling, Agent Memory, LLM Alignment

Recent Activity

upvoted a paper about 19 hours ago

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives

Paper • 2505.19558 • Published May 26, 2025 • 1

updated 4 collections about 19 hours ago

ICE

In-Context Editing: Learning Knowledge from Self-Induced Distributions • 2 items • Updated about 19 hours ago

PoliCon

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives • 2 items • Updated about 19 hours ago

UAPO

Adaptive Preference Optimization with Uncertainty-aware Utility Anchor • 4 items • Updated about 19 hours ago

SAVE

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement • 4 items • Updated about 19 hours ago

updated a collection 8 days ago

UAPO

Adaptive Preference Optimization with Uncertainty-aware Utility Anchor • 4 items • Updated about 19 hours ago

authored a paper 8 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published 11 days ago • 10

updated 2 collections 8 days ago

ICE

In-Context Editing: Learning Knowledge from Self-Induced Distributions • 2 items • Updated about 19 hours ago

SAVE

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement • 4 items • Updated about 19 hours ago

upvoted a paper 8 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published 11 days ago • 10

submitted a paper to Daily Papers 8 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published 11 days ago • 10

updated a dataset 28 days ago

Yofuria/UltraFeedback-binarized-ms-swift-1024

Viewer • Updated 28 days ago • 38.9k • 63

published a dataset 28 days ago

Yofuria/UltraFeedback-binarized-ms-swift-1024

Viewer • Updated 28 days ago • 38.9k • 63

updated a dataset about 1 month ago

Yofuria/UltraFeedback-ms-swift-1024

Viewer • Updated Apr 27 • 41k • 79

published a dataset about 1 month ago

Yofuria/UltraFeedback-ms-swift-1024

Viewer • Updated Apr 27 • 41k • 79

updated a collection about 2 months ago

PoliCon

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives • 2 items • Updated about 19 hours ago