SWE-bench Pro

Name: SWE-bench Pro
Creator: DataBazaar
Keywords: ScaleAI/SWE-bench_Pro, swe-bench, coding-agents, benchmark, evaluation, software-engineering, scale-ai, agents, long-horizon, code, llm-eval

About this data

An enterprise-level benchmark dataset from Scale AI for evaluating AI agents on long-horizon software engineering tasks. It follows the structure of SWE-bench Verified and consists of challenging, real-world coding problems.

Schema

Name	Type	Description
repo	VARCHAR	Repository owner and name (e.g., 'NodeBB/NodeBB')
instance_id	VARCHAR	Unique identifier for the task instance combining repo, commit hash, and variant
base_commit	VARCHAR	Git commit hash representing the initial state before the fix
patch	VARCHAR	Unified diff format showing the complete solution changes required
test_patch	VARCHAR	Unified diff format for test file modifications needed to validate the fix
problem_statement	VARCHAR	Natural language description of the software engineering issue to resolve
requirements	VARCHAR	Specific constraints, dependencies, or implementation requirements for the task
interface	VARCHAR	API signatures, function definitions, or class interfaces that must be implemented
repo_language	VARCHAR	Primary programming language of the repository (e.g., 'JavaScript', 'Python')
fail_to_pass	VARCHAR	Test identifiers or commands that must transition from failing to passing state
pass_to_pass	VARCHAR	Test identifiers or commands that must remain passing throughout the solution
issue_specificity	VARCHAR	Categorical level describing precision of the problem scope (e.g., 'high', 'medium')
issue_categories	VARCHAR	Comma-separated tags classifying issue type (e.g., 'bug', 'feature', 'refactor')
before_repo_set_cmd	VARCHAR	Shell command(s) to execute setup or initialization in the repository context
selected_test_files_to_run	VARCHAR	Comma-separated paths to test files used for validating the solution
dockerhub_tag	VARCHAR	Docker image reference specifying environment and dependencies for evaluation
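
The schema above can be sketched in Python as follows. This is a minimal illustration, not an official loader: the `TaskInstance` class, the `split_csv` and `is_resolved` helpers, and the sample values are all hypothetical. It assumes that comma-separated columns such as issue_categories are plain comma-delimited strings, and that fail_to_pass and pass_to_pass have already been decoded into lists of test identifiers (the listing does not specify how those VARCHAR fields are encoded).

```python
from dataclasses import dataclass


@dataclass
class TaskInstance:
    """Hypothetical container for a few of the columns in the schema."""
    repo: str
    instance_id: str
    base_commit: str
    issue_categories: str            # comma-separated tags, per the schema
    selected_test_files_to_run: str  # comma-separated paths, per the schema


def split_csv(value: str) -> list[str]:
    """Split a comma-separated field, trimming whitespace and dropping blanks."""
    return [part.strip() for part in value.split(",") if part.strip()]


def is_resolved(fail_to_pass, pass_to_pass, passed_after_patch) -> bool:
    """Return True when every fail_to_pass and pass_to_pass test passed.

    Assumption: all three arguments are iterables of test identifiers that
    the caller has already decoded from the dataset and the test run.
    """
    required = set(fail_to_pass) | set(pass_to_pass)
    return required <= set(passed_after_patch)


# Illustrative values only; they are not taken from the dataset.
example = TaskInstance(
    repo="NodeBB/NodeBB",
    instance_id="hypothetical-instance-id",
    base_commit="0000000",
    issue_categories="bug, feature",
    selected_test_files_to_run="test/a.js,test/b.js",
)
categories = split_csv(example.issue_categories)
test_files = split_csv(example.selected_test_files_to_run)
```

Treating a task as resolved only when both test sets pass mirrors the descriptions of fail_to_pass ("must transition from failing to passing") and pass_to_pass ("must remain passing"); an actual evaluation harness may apply further conditions.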

Free

Open dataset

Quality: No ratings

3 downloads

Seller: DataBazaar

For AI Agents

Via MCP Server

# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "SWE-bench Pro" })
// Found: de2141e6-1a79-421b-a389-801739457e65
get_download_url({ dataset_id: "de2141e6-1a79-421b-a389-801739457e65" })  // free — no API key needed

Via REST API

# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/de2141e6-1a79-421b-a389-801739457e65/download-url
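
The same request can be sketched in Python using only the standard library. The helper names `download_url_endpoint` and `fetch_download_info` are hypothetical, and because the response format is not documented in this listing, the sketch returns the raw response body rather than assuming a particular JSON shape.

```python
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "de2141e6-1a79-421b-a389-801739457e65"


def download_url_endpoint(dataset_id: str, base: str = API_BASE) -> str:
    """Build the download-url endpoint shown in the curl command above."""
    return f"{base}/datasets/{dataset_id}/download-url"


def fetch_download_info(dataset_id: str, timeout: float = 30.0) -> str:
    """GET the endpoint and return the raw response body as text.

    No API key is sent, matching the note that free datasets need none.
    The body is returned unparsed because its format is not specified here.
    """
    url = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (performs a network request):
# print(fetch_download_info(DATASET_ID))
```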