RLHF (Reinforcement Learning from Human Feedback) aligns models such as ChatGPT with human preferences. Human raters compare pairs of model outputs, a reward model is trained on those comparisons, and the base model is then fine-tuned with reinforcement learning (commonly PPO) to maximize the learned reward.
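The reward-model step can be sketched with the standard Bradley–Terry pairwise loss: given scalar reward scores for a preferred and a rejected response, minimize the negative log-probability that the preferred one ranks higher. This is an illustrative pure-Python sketch; the function name and scalar inputs are assumptions, not a specific library's API.

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry negative log-likelihood that the chosen response
    is preferred: -log(sigmoid(r_chosen - r_rejected)).
    (Illustrative sketch; real reward models score token sequences.)"""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss is small when the reward model already ranks the human-preferred
# response higher, and large when the ranking is reversed.
print(round(preference_loss(2.0, 0.0), 4))  # low loss: correct ranking
print(round(preference_loss(0.0, 2.0), 4))  # high loss: reversed ranking
```

Training the reward model to minimize this loss over many human comparisons yields the reward signal that the subsequent RL fine-tuning stage optimizes.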