RLHF

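Reinforcement learning from human feedback (RLHF) aligns model behavior with human preferences by training a reward model on human comparisons of model outputs, then optimizing the model against that learned reward with reinforcement learning.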
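
As a rough illustration of how a reward model can be fit to human comparisons, the sketch below uses a pairwise (Bradley-Terry style) loss, -log sigmoid(r_chosen - r_rejected), which is one common choice and not necessarily the one any particular system uses. The linear reward model, the two-dimensional feature vectors, and the names `pairwise_loss`, `reward`, and `train_reward_model` are all assumptions made for this example.

```python
import math


def pairwise_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected)."""
    margin = r_chosen - r_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def reward(weights, features):
    """Hypothetical linear reward model: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(comparisons, dim, lr=0.1, epochs=200):
    """Fit weights by gradient descent so chosen outputs score above rejected ones."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in comparisons:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # Derivative of the loss with respect to the margin is -sigmoid(-margin).
            grad_scale = -1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                weights[i] -= lr * grad_scale * (chosen[i] - rejected[i])
    return weights


# Toy data (invented for illustration): each pair is (chosen, rejected),
# with made-up features [helpfulness, rudeness].
comparisons = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.3, 0.9]),
    ([0.6, 0.2], [0.5, 0.7]),
]
weights = train_reward_model(comparisons, dim=2)
```

In a full RLHF pipeline the learned reward would then drive a reinforcement learning step that updates the model itself; that step is omitted here.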

Type
AI concept term
Language
en