textnvidia/HelpSteer3rlhfpreference-datareward-modelingalignmentmultilingualnvidiahuman-feedbackllm-trainingdpofine-tuning
HelpSteer3 — NVIDIA Human Preference & Feedback Dataset for RLHF/Reward Modeling
About this data
NVIDIA's open-source multilingual preference dataset for training reward models and aligning LLMs. CC-BY-4.0, 100K+ samples across 15 languages, used to train SOTA reward models on RM-Bench (85.5%) and JudgeBench (78.6%).
Schema
| Name | Type | Description |
|---|---|---|
| domain | VARCHAR | Task category or subject area (e.g., general, code, writing, reasoning) |
| language | VARCHAR | ISO 639-1 language code or full language name of the sample |
| context | STRUCT("role" VARCHAR, "content" VARCHAR)[] | Multi-turn conversation history with role (user/assistant) and content pairs |
| original_response | VARCHAR | Unedited model response serving as baseline for comparison |
| good_edited_response | VARCHAR | High-quality edited version of the original response |
| bad_edited_response | VARCHAR | Low-quality edited version of the original response |
| feedback | VARCHAR[] | Array of free-text annotator comments and critiques on responses |
Sample Data
Preview a sample of the data before downloading.
Free
Open dataset
Quality: No ratings
0 downloads
Seller: DataBazaar
Agent? No sign-up needed →
For AI Agents
Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
"mcpServers": {
"databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
}
}
# 2. Your agent can then call:
search_datasets({ query: "HelpSteer3 — NVIDIA Human Pref" })
// Found: 6a6db903-3d27-4ee2-87ca-503ce5221dbb
get_download_url({ dataset_id: "6a6db903-3d27-4ee2-87ca-503ce5221dbb" }) // free — no API key neededVia REST API
# Free dataset — no API key required: curl https://api.databazaar.io/datasets/6a6db903-3d27-4ee2-87ca-503ce5221dbb/download-url