Jan 30, 2025
> Responses better than the group’s average get positive scores, while worse responses get negative scores.
Better in what sense? You're missing the foundation of the objective function. Is there a ground truth that each response in a group is compared against?