SQuAD 2.0 - Stanford Question Answering Dataset

Name: SQuAD 2.0 - Stanford Question Answering Dataset
Creator: DataBazaar
Keywords: rajpurkar/squad_v2, question-answering, reading-comprehension, extractive-qa, rag-evaluation, nlp-benchmark, wikipedia, english, fine-tuning

About this data

A reading comprehension benchmark of more than 150K questions about Wikipedia articles, including over 50K unanswerable questions written adversarially to resemble answerable ones. It is a standard dataset for training and evaluating extractive QA models.

Schema

Name	Type	Description
id	VARCHAR	Unique identifier for the question-passage pair in SQuAD 2.0.
title	VARCHAR	Wikipedia article title from which the passage was extracted.
context	VARCHAR	Full Wikipedia passage text that may contain the answer to the question.
question	VARCHAR	Natural language question about the passage; may be answerable or adversarially unanswerable.
answers	STRUCT("text" VARCHAR[], answer_start INTEGER[])	Object containing 'text' (list of answer spans) and 'answer_start' (character offset of each span within context); both lists are empty if the question is unanswerable.
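
To illustrate how the 'answers' field relates to 'context', here is a minimal Python sketch. The two records are hand-written to follow the schema above and are not rows taken from the dataset; the helper name `first_answer_span` is hypothetical.

```python
from typing import Optional, Tuple

# Hand-written example records that follow the schema; not actual dataset rows.
_context = "The Eiffel Tower is located in Paris."

answerable = {
    "id": "example-1",
    "title": "Example",
    "context": _context,
    "question": "Where is the Eiffel Tower located?",
    # answer_start holds the character offset of each span in context.
    "answers": {"text": ["Paris"], "answer_start": [_context.index("Paris")]},
}

unanswerable = {
    "id": "example-2",
    "title": "Example",
    "context": _context,
    "question": "Who designed the Eiffel Tower?",
    # Unanswerable questions carry empty lists in both fields.
    "answers": {"text": [], "answer_start": []},
}


def first_answer_span(record: dict) -> Optional[Tuple[int, int, str]]:
    """Return (start, end, text) for the first answer, or None if unanswerable."""
    texts = record["answers"]["text"]
    starts = record["answers"]["answer_start"]
    if not texts:
        return None
    start = starts[0]
    end = start + len(texts[0])
    # Slice the context so the span can be checked against the stored text.
    return start, end, record["context"][start:end]


print(first_answer_span(answerable))
print(first_answer_span(unanswerable))
```

Slicing 'context' with the stored offset should reproduce the answer text exactly, which makes a quick sanity check after any preprocessing that alters whitespace or encoding.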

Free

Open dataset

Seller: DataBazaar

For AI Agents

Via MCP Server

# 1. Add to your agent's MCP config (claude_desktop_config.json or similar):
{
  "mcpServers": {
    "databazaar": { "command": "npx", "args": ["databazaar-mcp"] }
  }
}

# 2. Your agent can then call:
search_datasets({ query: "SQuAD 2.0 - Stanford Question " })
// Found: 0a379db4-4ab9-467f-a801-c871dfacfe48
get_download_url({ dataset_id: "0a379db4-4ab9-467f-a801-c871dfacfe48" })  // free — no API key needed

Via REST API

# Free dataset — no API key required:
curl https://api.databazaar.io/datasets/0a379db4-4ab9-467f-a801-c871dfacfe48/download-url
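
The same request can be sketched in Python with only the standard library. The helper names below are hypothetical, and the response format is not described on this page, so the sketch returns the raw response body as text rather than assuming any particular fields.

```python
import urllib.request

API_BASE = "https://api.databazaar.io"
DATASET_ID = "0a379db4-4ab9-467f-a801-c871dfacfe48"


def download_url_endpoint(dataset_id: str) -> str:
    """Build the endpoint used in the curl command above."""
    return f"{API_BASE}/datasets/{dataset_id}/download-url"


def fetch_download_url_response(dataset_id: str, timeout: float = 30.0) -> str:
    """Send a GET request and return the raw body (format assumed unknown)."""
    endpoint = download_url_endpoint(dataset_id)
    with urllib.request.urlopen(endpoint, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_download_url_response(DATASET_ID)` performs the network request; inspect the returned text before deciding how to parse it.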