openai/summarize_from_feedback
How to use theblackcat102/electra-large-reward-model with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="theblackcat102/electra-large-reward-model")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("theblackcat102/electra-large-reward-model")
model = AutoModelForSequenceClassification.from_pretrained("theblackcat102/electra-large-reward-model")

Reward model pretrained on openai/webgpt_comparison and the human-feedback summarization dataset. Unlike the other electra-large model, this model is trained with a rank loss and on one additional dataset.
On the validation dataset, the results are much more stable than usual.
You can refer to this wandb for more details.
Slightly better than the previous webgpt-only model: electra-large
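
The card says only that the model is trained with a "rank loss" and does not give its exact form. As a rough illustration, here is a minimal sketch of one common pairwise ranking loss for reward models, -log(sigmoid(chosen - rejected)). The function names `pairwise_rank_loss` and `batch_rank_loss` are hypothetical, and this formulation is an assumption, not necessarily the loss used to train this model.

```python
import math

def pairwise_rank_loss(chosen_score: float, rejected_score: float) -> float:
    """Assumed pairwise rank loss: -log(sigmoid(chosen - rejected)).

    The loss is small when the preferred answer scores higher than the
    rejected one, and grows roughly linearly when the order is reversed.
    """
    margin = chosen_score - rejected_score
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def batch_rank_loss(pairs):
    """Average the pairwise loss over (chosen_score, rejected_score) tuples."""
    return sum(pairwise_rank_loss(c, r) for c, r in pairs) / len(pairs)

if __name__ == "__main__":
    # Hypothetical reward scores for two comparison pairs.
    print(batch_rank_loss([(2.0, -1.0), (0.5, 0.7)]))
```

With a loss of this kind, only the difference between the two scores matters, so the reward model's raw outputs are best read as relative rankings between answers to the same prompt rather than as absolute quality scores.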