ChemBench — Chemistry & Materials LLM Evaluation Benchmark

Name: ChemBench — Chemistry & Materials LLM Evaluation Benchmark
Creator: DataBazaar
Keywords: jablonkagroup/ChemBench, chemistry, benchmark, evaluation, llm-eval, materials-science, question-answering, multiple-choice, expert-curated

About this data

A manually curated benchmark for evaluating the chemistry and materials science capabilities of LLMs. It consists of expert-written question-answering and multiple-choice items. The data is MIT licensed and intended for evaluation only.

Schema

Name	Type	Description
canary	VARCHAR	Canary string warning that benchmark data must not appear in training corpora; includes a unique GUID
description	VARCHAR	Brief natural language summary of the evaluation item's topic or concept
examples	STRUCT("input" VARCHAR, "target" VARCHAR, target_scores VARCHAR)[]	Array of input-target pairs with scoring rubrics; input is the prompt/question, target is expected answer, target_scores maps answer options to correctness weights
in_humansubset_w_tool	BOOLEAN	Boolean flag indicating whether item was evaluated by human annotators using external tools or resources
in_humansubset_wo_tool	BOOLEAN	Boolean flag indicating whether item was evaluated by human annotators without external tools or resources
keywords	VARCHAR[]	Array of semantic tags describing task domain, difficulty level, required knowledge type, and assessment method
metrics	VARCHAR[]	Array of evaluation metric names applicable to this item (e.g. multiple_choice_grade, accuracy)
name	VARCHAR	Unique identifier or slug for the evaluation item within the benchmark
preferred_score	VARCHAR	Primary metric name recommended for scoring this specific item
uuid	VARCHAR	Universally unique identifier (v5 UUID) for the item
subfield	VARCHAR	Chemistry or materials science subdomain category (e.g. safety, synthesis, thermodynamics)
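
The schema types target_scores as VARCHAR while describing it as a mapping from answer options to correctness weights. A minimal sketch of how one item might be scored, assuming the string holds a JSON object (the listing does not confirm the encoding); the record and the helper names parse_target_scores and multiple_choice_grade are hypothetical and used only for illustration:

```python
import json


def parse_target_scores(raw):
    """Decode a target_scores value into a dict of option -> weight.

    Assumption: the VARCHAR holds a JSON object such as '{"A": 1, "B": 0}'.
    An empty or missing value yields an empty dict.
    """
    if not raw:
        return {}
    return {str(option): float(weight) for option, weight in json.loads(raw).items()}


def multiple_choice_grade(example, chosen):
    """Return the weight of the chosen option, or 0.0 if it is not listed."""
    scores = parse_target_scores(example.get("target_scores"))
    return scores.get(chosen, 0.0)


# Hypothetical record shaped like one element of the examples column
# (illustrative only, not taken from the benchmark).
example = {
    "input": "Which gas is released when zinc reacts with hydrochloric acid?",
    "target": None,
    "target_scores": '{"Hydrogen": 1, "Oxygen": 0, "Chlorine": 0}',
}

print(multiple_choice_grade(example, "Hydrogen"))
print(multiple_choice_grade(example, "Oxygen"))
```

If the actual files store target_scores in another form, only parse_target_scores would need to change.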

Free

Open dataset

Quality: No ratings

2 downloads

Seller: DataBazaar

For AI Agents

Via MCP Server

# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "ChemBench — Chemistry & Materi" })
// Found: ee9efdad-1794-4ecb-89d8-693621deee1d
get_download_url({ dataset_id: "ee9efdad-1794-4ecb-89d8-693621deee1d" })  // free — no API key needed
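
If the agent's config file already registers other MCP servers, the entry from step 1 has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the config is a JSON object with an "mcpServers" key as shown above; the add_mcp_server helper and the "other-server" entry are hypothetical:

```python
import json


def add_mcp_server(config, name, command, args):
    """Return a copy of config with one entry added under "mcpServers".

    The input dict is left unmodified; existing server entries are kept.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with another server already registered.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}

# Add the databazaar entry using the command and args from step 1.
updated = add_mcp_server(existing, "databazaar", "npx", ["databazaar-mcp"])
print(json.dumps(updated, indent=2))
```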

Via REST API

# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/ee9efdad-1794-4ecb-89d8-693621deee1d/download-url
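
The same request can be made from a script. A minimal sketch using only the Python standard library; the endpoint and dataset ID are copied from the curl command above, while the response format is an assumption (the listing does not document it), so the body is parsed as JSON when possible and returned as raw text otherwise. The function names are hypothetical:

```python
import json
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "ee9efdad-1794-4ecb-89d8-693621deee1d"


def download_url_endpoint(dataset_id, base=API_BASE):
    """Build the download-url endpoint for a dataset ID."""
    return f"{base}/datasets/{dataset_id}/download-url"


def fetch_download_info(dataset_id, timeout=30):
    """GET the endpoint and return the decoded response body.

    Assumption: the body is JSON. If it is not, the raw text is returned.
    """
    with urllib.request.urlopen(download_url_endpoint(dataset_id), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body


# Usage (performs a network request, so it is left commented out):
# print(fetch_download_info(DATASET_ID))
```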