1-person RLHF

If we train a value function on a single person's preferences, does the RLHF'd model still become annoyingly verbose, bland, etc.? How much of the 'all AI sounds the same' problem comes from pretraining vs. SFT vs. RLHF?
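To make "a value function on one person's preferences" concrete, here is a minimal sketch, not the author's method. It assumes the value function is a linear reward model fit with a Bradley-Terry style pairwise loss, which the idea itself does not specify. The feature set (`features`), the hedge-word list, and the two preference pairs are all hypothetical and exist only for illustration.

```python
import math

HEDGES = ("perhaps", "generally", "overall")  # hypothetical hedge-word list

def features(text):
    # Hypothetical features: scaled word count and number of hedge words.
    words = text.lower().split()
    return [len(words) / 10.0, float(sum(words.count(h) for h in HEDGES))]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def reward(w, text):
    # Linear reward: dot product of weights and features.
    return sum(wi * fi for wi, fi in zip(w, features(text)))

def fit_reward(pairs, lr=0.5, epochs=200):
    # Minimise -log sigmoid(r(chosen) - r(rejected)) by stochastic gradient descent.
    w = [0.0, 0.0]
    for _ in range(epochs):
        for chosen, rejected in pairs:
            diff = [a - b for a, b in zip(features(chosen), features(rejected))]
            margin = sum(wi * d for wi, d in zip(w, diff))
            step = lr * (1.0 - sigmoid(margin))
            w = [wi + step * d for wi, d in zip(w, diff)]
    return w

# Hypothetical (chosen, rejected) pairs from one person who prefers terse answers.
pairs = [
    ("Use a dict.",
     "Overall, there are generally many options, but perhaps a dict is a good choice here."),
    ("Yes.",
     "Generally speaking, perhaps yes, though overall it depends on many factors."),
]

w = fit_reward(pairs)
terse = reward(w, "No.")
wordy = reward(w, "Perhaps overall it is generally no in many cases.")
print(w, terse > wordy)
```

In this toy setup the learned weights penalise length and hedging, so the reward model itself prefers the terse answer. The open question in the idea is whether a policy optimised against such a single-person reward would still drift toward the familiar verbose, bland style.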

Tags: llms, persona, personalization

If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:

@misc{holtzman-1person-rlhf-2026,
  author = {Holtzman, Ari},
  title = {1-person RLHF},
  year = {2026},
  url = {https://hypogenic.ai/ideahub/idea/tR0POffUc3NV1zOIXq9d}
}
