BigCodeBench — Code Generation Benchmark (Complete & Instruct)

Name: BigCodeBench — Code Generation Benchmark (Complete & Instruct)
Creator: DataBazaar
Keywords: bigcode/bigcodebench, code-generation, benchmark, evaluation, python, llm-eval, bigcode, agent-eval, apache-2.0

About this data

A 1,140-task Python code generation benchmark from BigCode. Every task ships in two variants: Complete (docstring-based code completion) and Instruct (natural-language instruction). The test cases average 99% branch coverage. Licensed under Apache-2.0.

Schema

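Name	Type	Description
task_id	VARCHAR	Unique identifier of the task
complete_prompt	VARCHAR	Prompt for the Complete variant: function signature plus structured docstring
instruct_prompt	VARCHAR	Prompt for the Instruct variant: natural-language instruction
canonical_solution	VARCHAR	Reference solution, without comments
code_prompt	VARCHAR	Code-only prompt: imports and function header
test	VARCHAR	Test code, wrapped in a unittest.TestCase class
entry_point	VARCHAR	Name of the function under test
doc_struct	VARCHAR	Structured (parsed) form of the docstring
libs	VARCHAR	Libraries used by the task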
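
The sketch below illustrates how the columns above could fit together when scoring a Complete-variant task. It assumes that complete_prompt ends where the function body begins, that canonical_solution (or a model completion) is that body, and that test defines a unittest.TestCase subclass calling the entry point. The record here is a hypothetical toy row, not a real benchmark task, and run_record is an illustrative helper rather than part of any official harness.

```python
import unittest

# Hypothetical toy record shaped like the schema above (not a real task).
record = {
    "task_id": "Toy/0",
    "complete_prompt": 'def task_func(xs):\n    """Return the sum of xs."""\n',
    "canonical_solution": "    return sum(xs)\n",
    "test": (
        "import unittest\n"
        "class TestCases(unittest.TestCase):\n"
        "    def test_sum(self):\n"
        "        self.assertEqual(task_func([1, 2, 3]), 6)\n"
    ),
    "entry_point": "task_func",
}


def run_record(rec, completion):
    """Join prompt, completion and tests, run them, return (tests_run, passed)."""
    program = rec["complete_prompt"] + completion + "\n" + rec["test"]
    namespace = {}
    # exec is acceptable for this trusted toy input only;
    # real model output should run inside a sandbox.
    exec(program, namespace)

    # Collect every TestCase subclass the test code defined.
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    for obj in list(namespace.values()):
        if (
            isinstance(obj, type)
            and issubclass(obj, unittest.TestCase)
            and obj is not unittest.TestCase
        ):
            suite.addTests(loader.loadTestsFromTestCase(obj))

    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, result.wasSuccessful()


reference = run_record(record, record["canonical_solution"])
broken = run_record(record, "    return 0\n")
```

Here reference describes the run with the reference body and broken the run with a deliberately wrong body, so the two results can be compared directly.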

Free

Open dataset

Quality: No ratings

1 download

Seller: DataBazaar

For AI Agents

Via MCP Server

# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "BigCodeBench — Code Generation" })
// Found: 8bb019dc-566b-4e6c-81bb-6adc1a97ab32
get_download_url({ dataset_id: "8bb019dc-566b-4e6c-81bb-6adc1a97ab32" })  // free — no API key needed
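
Step 1 above can also be scripted. The following is a minimal sketch, assuming the agent's MCP config is a JSON file with a top-level "mcpServers" object as shown; add_mcp_server is a hypothetical helper name, and the config path is whatever your agent actually reads. It merges the databazaar entry without overwriting servers that are already configured.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command, args):
    """Add or update one entry under "mcpServers", keeping existing entries."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    # Treat a missing or empty file as an empty config.
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example call (the path is an assumption; point it at your agent's config):
# add_mcp_server("claude_desktop_config.json", "databazaar", "npx", ["databazaar-mcp"])
```

Writing through setdefault keeps the script safe to run repeatedly: the second run simply rewrites the same entry.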

Via REST API

# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/8bb019dc-566b-4e6c-81bb-6adc1a97ab32/download-url
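
The same request can be issued from Python with only the standard library. This is a sketch under assumptions: the endpoint path is copied from the curl command above, while the response format is not documented in this listing, so the body is returned as raw text instead of being parsed. Both function names are illustrative.

```python
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "8bb019dc-566b-4e6c-81bb-6adc1a97ab32"


def download_url_endpoint(dataset_id):
    """Build the download-url endpoint for a dataset ID."""
    return f"{API_BASE}/datasets/{dataset_id}/download-url"


def fetch_download_url_response(dataset_id, timeout=30):
    """GET the endpoint and return the raw response body as text.

    The response schema is an unknown here, so nothing is parsed.
    """
    endpoint = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(endpoint, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (performs a network request):
# print(fetch_download_url_response(DATASET_ID))
```

Inspect the returned text once before writing any code that depends on specific fields in it.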