Tags: text, open-thoughts/TaskTrove, agentic-tasks, agent-training, swe-bench, reinforcement-learning, code, sft, rl, open-thoughts, harbor, apache-2.0
TaskTrove — 750K+ Agentic Tasks for RL & SFT Training
About this data
An open-source collection of more than 750,000 unique agentic tasks aggregated from over 100 sources, including SWE-Smith, R2EGym, and SWE-Re-Bench. It is Apache-2.0 licensed, distributed in Parquet format, and designed for agent training and evaluation.
Schema
| Name | Type | Description |
|---|---|---|
| path | VARCHAR | Unique identifier or file path reference for the task record within the dataset. |
| task_binary | BLOB | Gzip-compressed binary blob containing serialized task data (instruction, repo, tests, solution, metadata). |
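Since the schema describes `task_binary` as a gzip-compressed blob, a reader will typically need to decompress it before use. The sketch below shows one way to do that with the Python standard library. The listing does not say how the payload is serialized after decompression, so the JSON attempt is only a guess, and the helper names (`decode_task_binary`, `try_parse_json`) and the synthetic record are hypothetical stand-ins rather than part of the dataset.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_task_binary(blob: bytes) -> bytes:
    """Return the decompressed payload of one task_binary value."""
    if not blob.startswith(GZIP_MAGIC):
        raise ValueError("blob does not start with the gzip magic bytes")
    return gzip.decompress(blob)


def try_parse_json(payload: bytes):
    """Return parsed JSON if the payload happens to be JSON, else None.

    Assumption: the inner serialization is not documented in the listing,
    so JSON is only one possibility to probe for.
    """
    try:
        return json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None


# Synthetic stand-in for a real row; the field names are illustrative only.
fake_task = {"instruction": "fix the bug", "repo": "example/repo"}
blob = gzip.compress(json.dumps(fake_task).encode("utf-8"))

payload = decode_task_binary(blob)
parsed = try_parse_json(payload)
```

If `try_parse_json` returns `None` on real rows, inspecting the first bytes of `payload` is a reasonable next step for identifying the actual format.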
Free
Open dataset
Seller: DataBazaar
For AI Agents
Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}
# 2. Your agent can then call:
search_datasets({ query: "TaskTrove — 750K+ Agentic Task" })
// Found: 7b8ec43b-d989-4c76-a2a7-2e1f0fe3ebb0
get_download_url({ dataset_id: "7b8ec43b-d989-4c76-a2a7-2e1f0fe3ebb0" }) // free, no API key needed
Via REST API
# Free dataset, no API key required:
curl https://api.databazaar.io/datasets/7b8ec43b-d989-4c76-a2a7-2e1f0fe3ebb0/download-url
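The same REST request can be made from Python using only the standard library. This is a minimal sketch: the endpoint path and dataset ID are taken from the curl command above, while the function names are hypothetical and the response format is not documented in the listing, so the body is returned as raw text rather than parsed.

```python
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "7b8ec43b-d989-4c76-a2a7-2e1f0fe3ebb0"


def download_url_endpoint(dataset_id: str) -> str:
    """Build the download-url endpoint for a dataset ID."""
    return f"{API_BASE}/datasets/{dataset_id}/download-url"


def fetch_download_url_response(dataset_id: str, timeout: float = 30.0) -> str:
    """GET the endpoint and return the raw response body as text.

    Assumption: the response shape is unknown, so nothing is parsed here.
    """
    endpoint = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(endpoint, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (performs a network request, so it is left commented out):
# print(fetch_download_url_response(DATASET_ID))
```

Returning the raw body keeps the sketch honest about what is known; once the actual response has been inspected, a parsing step can be added.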