scientific / jablonkagroup/ChemBench
Tags: chemistry, benchmark, evaluation, llm-eval, materials-science, question-answering, multiple-choice, expert-curated

ChemBench — Chemistry & Materials LLM Evaluation Benchmark

Category: Scientific
Records: 2,785 rows
Format: PARQUET
Update Frequency: One-time snapshot
Collection Method: auto_imported_huggingface_federated
PII: None detected
File Size: ~0.5 MB
Downloads: 0

About this data

ChemBench is a manually curated benchmark for evaluating the chemistry and materials science capabilities of LLMs. It consists of expert-generated question-answering and multiple-choice items. The data is MIT licensed and intended for evaluation only.

Schema

Name | Type | Description
canary | VARCHAR | Deduplication string warning that benchmark data must not appear in training corpora, includes unique GUID
description | VARCHAR | Brief natural language summary of the evaluation item's topic or concept
examples | STRUCT("input" VARCHAR, "target" VARCHAR, target_scores VARCHAR)[] | Array of input-target pairs with scoring rubrics; input is the prompt/question, target is expected answer, target_scores maps answer options to correctness weights
in_humansubset_w_tool | BOOLEAN | Boolean flag indicating whether item was evaluated by human annotators using external tools or resources
in_humansubset_wo_tool | BOOLEAN | Boolean flag indicating whether item was evaluated by human annotators without external tools or resources
keywords | VARCHAR[] | Array of semantic tags describing task domain, difficulty level, required knowledge type, and assessment method
metrics | VARCHAR[] | Array of evaluation metric names applicable to this item (e.g. multiple_choice_grade, accuracy)
name | VARCHAR | Unique identifier or slug for the evaluation item within the benchmark
preferred_score | VARCHAR | Primary metric name recommended for scoring this specific item
uuid | VARCHAR | Universally unique identifier (v5 UUID) for the item
subfield | VARCHAR | Chemistry or materials science subdomain category (e.g. safety, synthesis, thermodynamics)
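
The schema above suggests how a multiple-choice item could be graded. The following is a minimal sketch, not the benchmark's own scoring code. It assumes that target_scores, which the schema lists as VARCHAR, holds a JSON object mapping each answer option to a weight; that encoding is an assumption. The helper names (parse_target_scores, multiple_choice_grade, human_subset) and the sample row are hypothetical and are not taken from the dataset.

```python
import json


def parse_target_scores(raw):
    """Decode a target_scores string into {option: weight}.

    Assumption: the VARCHAR holds a JSON object. An empty or missing
    value is treated as "no scored options" (e.g. an open-ended item).
    """
    if not raw:
        return {}
    return {str(option): float(weight) for option, weight in json.loads(raw).items()}


def multiple_choice_grade(example, chosen):
    """Return the weight of the chosen option, or 0.0 if it is not listed."""
    scores = parse_target_scores(example.get("target_scores"))
    return scores.get(chosen, 0.0)


def human_subset(rows, with_tool):
    """Keep only rows flagged as part of the human-evaluated subset."""
    flag = "in_humansubset_w_tool" if with_tool else "in_humansubset_wo_tool"
    return [row for row in rows if row.get(flag)]


# Hypothetical row shaped like the schema; the content is invented for illustration.
row = {
    "name": "hypothetical-item-1",
    "preferred_score": "multiple_choice_grade",
    "in_humansubset_w_tool": False,
    "in_humansubset_wo_tool": True,
    "examples": [
        {
            "input": "Which gas is produced when zinc reacts with hydrochloric acid?",
            "target": None,
            "target_scores": json.dumps({"Hydrogen": 1, "Chlorine": 0, "Oxygen": 0}),
        }
    ],
}

grade = multiple_choice_grade(row["examples"][0], "Hydrogen")
```

If the actual file encodes target_scores differently, only parse_target_scores would need to change; the grading and filtering helpers work on the decoded dictionary and the boolean flags.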

Free

Open dataset

Seller: DataBazaar

For AI Agents

Via MCP Server
# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "ChemBench — Chemistry & Materi" })
// Found: ee9efdad-1794-4ecb-89d8-693621deee1d
get_download_url({ dataset_id: "ee9efdad-1794-4ecb-89d8-693621deee1d" })  // free, no API key needed

Via REST API
# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/ee9efdad-1794-4ecb-89d8-693621deee1d/download-url
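
The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the endpoint path and dataset ID are copied from the curl command above, but the response format is not documented here, so the code tries JSON and falls back to the raw text. The function names are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "ee9efdad-1794-4ecb-89d8-693621deee1d"


def download_url_endpoint(dataset_id, base=API_BASE):
    """Build the download-url endpoint shown in the curl example."""
    return f"{base}/datasets/{dataset_id}/download-url"


def fetch_download_info(dataset_id, timeout=30):
    """GET the endpoint without an API key.

    Assumption: the body may be JSON. If it does not parse as JSON,
    the raw text is returned unchanged.
    """
    url = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


endpoint = download_url_endpoint(DATASET_ID)
```

Calling fetch_download_info(DATASET_ID) performs the network request; inspect what it returns before relying on any particular field name.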