RLHF

ai-ml

Reinforcement Learning from Human Feedback, a training technique used to align language models with human preferences.

Pronunciation

Correct

R-L-H-F

/ɑːr ɛl eɪtʃ ɛf/

Spelled out letter by letter as an initialism. There is no accepted word-form pronunciation.

Source: arxiv.org (official spec)