Reinforcement learning from human feedback aligns model behavior with human preferences by training reward models from human comparisons.
Open the related AI concept, tools, and learning paths.