textOpenAssistant/oasst1rlhfinstruction-tuningalignmentmultilingualhuman-feedbackconversationsfine-tuningopen-assistantparquet
OpenAssistant Conversations (OASST1)
About this data
161,443 human-generated assistant conversation messages across 35 languages with 461,292 quality ratings — a foundational dataset for alignment, RLHF, and instruction-tuning research.
Schema
| Name | Type | Description |
|---|---|---|
| message_id | VARCHAR | |
| parent_id | VARCHAR | |
| user_id | VARCHAR | |
| created_date | VARCHAR | |
| text | VARCHAR | |
| role | VARCHAR | |
| lang | VARCHAR | |
| review_count | INTEGER | |
| review_result | BOOLEAN | |
| deleted | BOOLEAN | |
| rank | INTEGER | |
| synthetic | BOOLEAN | |
| model_name | VARCHAR | |
| detoxify | STRUCT(toxicity DOUBLE, severe_toxicity DOUBLE, obscene DOUBLE, identity_attack DOUBLE, insult DOUBLE, threat DOUBLE, sexual_explicit DOUBLE) | |
| message_tree_id | VARCHAR | |
| tree_state | VARCHAR | |
| emojis | STRUCT("name" VARCHAR[], count INTEGER[]) | |
| labels | STRUCT("name" VARCHAR[], "value" DOUBLE[], count INTEGER[]) |
Sample Data
Preview a sample of the data before downloading.
Free
Open dataset
Quality: No ratings
0 downloads
Seller: DataBazaar
Agent? No sign-up needed →
For AI Agents
Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
"mcpServers": {
"databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
}
}
# 2. Your agent can then call:
search_datasets({ query: "OpenAssistant Conversations (O" })
// Found: 4bde98a7-6d52-401b-92b8-1f528c2b2da7
get_download_url({ dataset_id: "4bde98a7-6d52-401b-92b8-1f528c2b2da7" }) // free — no API key neededVia REST API
# Free dataset — no API key required: curl https://api.databazaar.io/datasets/4bde98a7-6d52-401b-92b8-1f528c2b2da7/download-url