textopen-thoughts/AgentTroveagentsagentic-tracescodereinforcement-learningfine-tuningtool-usesftapache-2.0

AgentTrove: 1.7M Agentic Interaction Traces

Name: AgentTrove: 1.7M Agentic Interaction Traces
Creator: DataBazaar
Keywords: open-thoughts/AgentTrove, agents, agentic-traces, code, reinforcement-learning, fine-tuning, tool-use, sft, apache-2.0

About this data

Largest open-source collection of agentic interaction traces (1.7M rows from 219 source datasets) covering code repair, shell scripting, math, competitive programming, and computer-use tasks. Apache 2.0.

Schema

Name	Type	Description
conversations	STRUCT("content" VARCHAR, "role" VARCHAR)[]	Multi-turn agent-environment interaction with content (string) and role (agent/user/system) pairs
agent	VARCHAR	Agent identifier or name executing the task
model	VARCHAR	Language model name used by the agent
date	VARCHAR	Timestamp or date when the trace was recorded
task	VARCHAR	Original task prompt or instruction text
episode	VARCHAR	Episode identifier for grouped interactions
run_id	VARCHAR	Unique identifier for a single execution run
trial_name	VARCHAR	Name or label for the trial/experiment
model_provider	VARCHAR	Provider of the language model (e.g., OpenAI, Anthropic)
original_source	VARCHAR	Source dataset or origin of the task (e.g., exp_rpt)
original_teacher	VARCHAR	Teacher model or reference model that generated the trace
result	VARCHAR	Result status or outcome of the agent's execution
trace_source	VARCHAR	Source system or framework that generated the trace
path	VARCHAR	File path or identifier for the task resource
task_binary	BLOB	Gzip-compressed binary encoding of task data
instruction	VARCHAR	Detailed instruction or prompt given to the agent
verifier_output	VARCHAR	Output from verification/evaluation of agent's solution
ground_truth	VARCHAR	Expected correct solution or reference output
judgment	VARCHAR	Human or automated judgment of solution correctness
__index_level_0__	BIGINT	Zero-indexed row number in the original dataset
split	VARCHAR	Dataset partition label (train/val/test)

Sample Data

Preview a sample of the data before downloading.

Free

Open dataset

Quality: No ratings

0 downloads

Seller: DataBazaar

Agent? No sign-up needed →

For AI Agents

Via MCP Server

# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "AgentTrove: 1.7M Agentic Inter" })
// Found: 1d5b9795-bb25-4713-bb89-7975e43c7f9c
get_download_url({ dataset_id: "1d5b9795-bb25-4713-bb89-7975e43c7f9c" })  // free — no API key needed

Via REST API

# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/1d5b9795-bb25-4713-bb89-7975e43c7f9c/download-url