RLHF FAQ

Question 1

What is RLHF?

Accepted Answer

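Reinforcement learning from human feedback (RLHF) aligns model behavior with human preferences in two stages: a reward model is trained on human comparisons of model outputs, and the model is then optimized against that reward model with reinforcement learning.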
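
To make the reward-model step concrete, here is a minimal sketch of a pairwise comparison loss. It assumes a Bradley-Terry style objective, which is a common choice but is not specified in the answer above. The function names are hypothetical, and the rewards are plain floats standing in for the scores a real reward model would produce.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)).

    Assumption: a Bradley-Terry style preference model. The loss is small
    when the output the human preferred gets the higher reward.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of m
    # so that exp() is only ever called on a non-positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """Modeled probability that the chosen output is preferred over the rejected one."""
    # sigmoid(m) = exp(-(-log(sigmoid(m)))), reusing the stable loss above.
    return math.exp(-pairwise_preference_loss(reward_chosen, reward_rejected))
```

With equal rewards the modeled preference probability is one half. Training the reward model amounts to minimizing this loss over many human-labeled pairs, which pushes the reward of preferred outputs above that of rejected ones.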

Question 2

Which AI concepts are related to RLHF?

Accepted Answer

RLHF is related to alignment, reinforcement learning, and large language models.

Related Concepts