textopen-thoughts/AgentTroveagentsagentic-tracescodereinforcement-learningfine-tuningtool-usesftapache-2.0
AgentTrove: 1.7M Agentic Interaction Traces
About this data
Largest open-source collection of agentic interaction traces (1.7M rows from 219 source datasets) covering code repair, shell scripting, math, competitive programming, and computer-use tasks. Apache 2.0.
Schema
| Name | Type | Description |
|---|---|---|
| conversations | STRUCT("content" VARCHAR, "role" VARCHAR)[] | Multi-turn agent-environment interaction with content (string) and role (agent/user/system) pairs |
| agent | VARCHAR | Agent identifier or name executing the task |
| model | VARCHAR | Language model name used by the agent |
| date | VARCHAR | Timestamp or date when the trace was recorded |
| task | VARCHAR | Original task prompt or instruction text |
| episode | VARCHAR | Episode identifier for grouped interactions |
| run_id | VARCHAR | Unique identifier for a single execution run |
| trial_name | VARCHAR | Name or label for the trial/experiment |
| model_provider | VARCHAR | Provider of the language model (e.g., OpenAI, Anthropic) |
| original_source | VARCHAR | Source dataset or origin of the task (e.g., exp_rpt) |
| original_teacher | VARCHAR | Teacher model or reference model that generated the trace |
| result | VARCHAR | Result status or outcome of the agent's execution |
| trace_source | VARCHAR | Source system or framework that generated the trace |
| path | VARCHAR | File path or identifier for the task resource |
| task_binary | BLOB | Gzip-compressed binary encoding of task data |
| instruction | VARCHAR | Detailed instruction or prompt given to the agent |
| verifier_output | VARCHAR | Output from verification/evaluation of agent's solution |
| ground_truth | VARCHAR | Expected correct solution or reference output |
| judgment | VARCHAR | Human or automated judgment of solution correctness |
| __index_level_0__ | BIGINT | Zero-indexed row number in the original dataset |
| split | VARCHAR | Dataset partition label (train/val/test) |
Sample Data
Preview a sample of the data before downloading.
Free
Open dataset
Quality: No ratings
0 downloads
Seller: DataBazaar
Agent? No sign-up needed →
For AI Agents
Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
"mcpServers": {
"databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
}
}
# 2. Your agent can then call:
search_datasets({ query: "AgentTrove: 1.7M Agentic Inter" })
// Found: 1d5b9795-bb25-4713-bb89-7975e43c7f9c
get_download_url({ dataset_id: "1d5b9795-bb25-4713-bb89-7975e43c7f9c" }) // free — no API key neededVia REST API
# Free dataset — no API key required: curl https://api.databazaar.io/datasets/1d5b9795-bb25-4713-bb89-7975e43c7f9c/download-url