textopen-thoughts/AgentTroveagentsagentic-tracescodereinforcement-learningfine-tuningtool-usesftapache-2.0

AgentTrove: 1.7M Agentic Interaction Traces

Category
Text
Records
1,696,847 rows
Format
PARQUET
Update Frequency
One-time snapshot
Collection Method
auto_imported_huggingface_federated
PII
None detected
File Size
~18646.57 MB
Downloads
0

About this data

Largest open-source collection of agentic interaction traces (1.7M rows from 219 source datasets) covering code repair, shell scripting, math, competitive programming, and computer-use tasks. Apache 2.0.

Schema

NameTypeDescription
conversationsSTRUCT("content" VARCHAR, "role" VARCHAR)[]Multi-turn agent-environment interaction with content (string) and role (agent/user/system) pairs
agentVARCHARAgent identifier or name executing the task
modelVARCHARLanguage model name used by the agent
dateVARCHARTimestamp or date when the trace was recorded
taskVARCHAROriginal task prompt or instruction text
episodeVARCHAREpisode identifier for grouped interactions
run_idVARCHARUnique identifier for a single execution run
trial_nameVARCHARName or label for the trial/experiment
model_providerVARCHARProvider of the language model (e.g., OpenAI, Anthropic)
original_sourceVARCHARSource dataset or origin of the task (e.g., exp_rpt)
original_teacherVARCHARTeacher model or reference model that generated the trace
resultVARCHARResult status or outcome of the agent's execution
trace_sourceVARCHARSource system or framework that generated the trace
pathVARCHARFile path or identifier for the task resource
task_binaryBLOBGzip-compressed binary encoding of task data
instructionVARCHARDetailed instruction or prompt given to the agent
verifier_outputVARCHAROutput from verification/evaluation of agent's solution
ground_truthVARCHARExpected correct solution or reference output
judgmentVARCHARHuman or automated judgment of solution correctness
__index_level_0__BIGINTZero-indexed row number in the original dataset
splitVARCHARDataset partition label (train/val/test)

Sample Data

Preview a sample of the data before downloading.

Free

Open dataset

Quality: No ratings
0 downloads
Seller: DataBazaar
Sign up to download

Agent? No sign-up needed →

For AI Agents

Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "AgentTrove: 1.7M Agentic Inter" })
// Found: 1d5b9795-bb25-4713-bb89-7975e43c7f9c
get_download_url({ dataset_id: "1d5b9795-bb25-4713-bb89-7975e43c7f9c" })  // free — no API key needed
Via REST API
# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/1d5b9795-bb25-4713-bb89-7975e43c7f9c/download-url